Kantu IDE extract multiple lines

Abderrahman_Elghazi · July 19, 2018, 7:33am

Hi,
I need to extract some paragraphs that span multiple lines from an HTML page and store them in a CSV file. Then I want to join all the lines. So far I have used sourceExtract, but it does not work for all the paragraphs I want to retrieve (no single regex matches all of them). Is there an easier and faster way to do this?
I should add that when using storeText I get only one line at a time, or only the very last character. Also, the whole paragraph isn’t contained in a single p element.
Thanks,
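
For illustration only, the "join all the lines" step described above could look like the following outside of Kantu. This is a minimal Python sketch, not a Kantu feature: it assumes the page source is already available as a string, and the names TextCollector, join_paragraph and the sample fragment are hypothetical. It collects the text of every element in the fragment and joins it into one line, which suits a paragraph that is spread over several elements.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect the non-empty text chunks of an HTML fragment, in document order."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:  # skip whitespace-only text between tags
            self.chunks.append(text)


def join_paragraph(html_fragment):
    """Return all text in the fragment as a single line joined by spaces."""
    collector = TextCollector()
    collector.feed(html_fragment)
    collector.close()
    # Collapse line breaks inside each chunk, then join the chunks.
    return " ".join(" ".join(chunk.split()) for chunk in collector.chunks)


# Hypothetical fragment: one logical paragraph split over several elements.
sample = "<p>First line<br>second line</p>\n<div>third\n line</div>"
joined = join_paragraph(sample)
print(joined)
```

Note that this sketch also picks up text inside script or style elements, so it is best fed only the part of the page that holds the paragraph.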

ulrich · July 19, 2018, 9:35am

A link to the website (or maybe a screenshot of the website) would be helpful for us to suggest a solution.

MarkHunnibell · August 6, 2018, 1:19pm

I have the same/similar question. Basically, I want the macro to go to a web page and then export the source HTML to a file. I understand that, for reasons I do not comprehend, Kantu can only output as a CSV file, and I have been able to get it to do that… sort of. BUT it only saves the first line of the HTML file. How do I get the script to save ALL the lines in the HTML file to the CSV file? Below is the macro I have that exports and saves the first line (the reference to CNN is just a placeholder).

{
  "CreationDate": "2018-8-6",
  "Commands": [
    {
      "Command": "open",
      "Target": "https://www.cnn.com/",
      "Value": ""
    },
    {
      "Command": "waitForPageToLoad",
      "Target": "",
      "Value": ""
    },
    {
      "Command": "sourceExtract",
      "Target": "*",
      "Value": "blob"
    },
    {
      "Command": "store",
      "Target": "${blob}",
      "Value": "!csvLine"
    },
    {
      "Command": "csvSave",
      "Target": "cnnHTML",
      "Value": "*"
    },
    {
      "Command": "localStorageExport",
      "Target": "cnnHTML.csv",
      "Value": ""
    }
  ]
}
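
As a side note on the CSV part of the question: a CSV cell can hold text with line breaks, as long as the writer quotes the field. The Python sketch below shows this with the standard csv module only. It says nothing about how Kantu's csvSave handles line breaks, which is not confirmed here, and the function names and the sample HTML are hypothetical.

```python
import csv
import io


def save_blob_as_csv_row(blob):
    """Write one multi-line string as a single CSV row with a single cell."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    # Fields that contain line breaks or commas are quoted automatically.
    writer.writerow([blob])
    return buffer.getvalue()


def load_first_cell(csv_text):
    """Read the first cell of the first row back from CSV text."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    return next(reader)[0]


# Hypothetical page source with several lines and a comma.
html_source = "<html>\n<body>\n<p>Hello, world</p>\n</body>\n</html>"
csv_text = save_blob_as_csv_row(html_source)
restored = load_first_cell(csv_text)
print(restored == html_source)
```

If the tool that writes the CSV does not quote fields this way, replacing the line breaks with spaces before storing the value is the usual fallback.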