- Posts: 63
excel to sqlite3
- brian clark
- Topic Author
- Offline
- User
I have excel files and a folder with the same name full of html files
file 1.xlsx
folder is file 1
containing html files like this
original_https_0carbonkwh.co.uk_.html
original_https_0carbonkwh.com_.html
in the excel file the column originalurl contains the domains
https://0carbonkwh.co.uk
https://0carbonkwh.com
What i am trying to do is go from excel, compare the domain name (column 1) originalurl to the html file name (with temp replacement to match the same domain name and insert that with the excel data 89 columns in total.
Not sure if the rawhtml (last column can use C sharp to read the column row, read the folder and its html files. if matching html file name is found insert into the rawhtml column then inserting the rest of the row contents.Any help here would be welcomed I am not sure about purchasing it yet unless it can do the above.
Cheers
Brian Clark
ps, i found a bug with the excel adaptor if a sheet has a empty space at the end of the name it brings back an error
i had "1 - " renamed it to "1" and now it works. y ou should do exact matches when reading sheet names including white spaces
Please Log in or Create an account to join the conversation.
- brian clark
- Topic Author
- Offline
- User
- Posts: 63
Please Log in or Create an account to join the conversation.
- FlowHeater-Team
- Offline
- Admin
Thanks for this interesting inquiry 😊
Please find attached a brief example including a SQLite3 database to run with the SQLite Adapter on the WRITE site.
- First you have to build the file name and path to the file. For this I´m using the
System Heater
to retrieve the current READ Excel worksheet name. With the
Path Heater
I´m extracting just the Excel worksheet name without the extension .xlsx. The with the
String Replace Heater
and the
String Append Heater
the new expected path and file name are created.
- To check whether the file is available you need the help of the
.NET Script Heater
and a little C# script. In case the file is not exists, the
Filter Heater
will be skipped/filter out this line for further processing.
- With the File Heater you could now read/load the HTML file from the Certain folder and import them to the SQLite blob field. Please note, you have to switch the FlowHeater DataType from RAW to STRING.
That´s all, hope you enjoy 😊
Best wishes
Robert Stark
Did this answer your question? We would be grateful if you provide a brief comment as feedback. It may also help others who may have encountered a similar problem.
Please Log in or Create an account to join the conversation.
- brian clark
- Topic Author
- Offline
- User
- Posts: 63
testing it out, 315 million records to go ;0)
You are a gent
ironically I found you via our own search engine :0)
Cheers
Brian
Please Log in or Create an account to join the conversation.
- brian clark
- Topic Author
- Offline
- User
- Posts: 63
I tried the 32bit crashes around 19834 record here.
Will retry and have changed the db html to text instead of blob
Please Log in or Create an account to join the conversation.
- FlowHeater-Team
- Offline
- Admin
That´s strange. Usually the Excel Adapter works with both 32/64 Bit Office/Excel Versions. You could mix the FlowHeater and Office/Excel Version like you want. e.g. 32-Bit FlowHeater and 64-Bit Excel, … For more analysis please share your used version and the occurred error messages, Thanks.
Related to your crash) Please share also the displayed error message. In case a file %TEMP%\flowheater.log exists please post this to check why the crash occur. Thanks.
In the attachment you´ll found a short test definition. This inserts 100.000 HTML files to the SQLite Database. In my opinion the import crashes related to some kind of special characters into your processed HTML files? In case you´ve got a detailed error message please share this to fix this issue.
Best wishes
Robert Stark
Did this answer your question? We would be grateful if you provide a brief comment as feedback. It may also help others who may have encountered a similar problem.
Please Log in or Create an account to join the conversation.
- brian clark
- Topic Author
- Offline
- User
- Posts: 63
Testing now
<script async src="https://www.googletagmanager.com/gtag/js?id=UA-120668807-1">
<script>
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', 'UA-120668807-1');
</script>
Is it best to use replace or regex in terms of speed
Please Log in or Create an account to join the conversation.
- FlowHeater-Team
- Offline
- Admin
the sign ' is already handled by the SQLite Adapter . I guess some strange UTF-8 character throws this issue. To you know the certain HTML file who raise this error? If yes may I ask you to share this here or send me the file via Email, Thanks.
Of course Regex would be faster but the String Replace Heater is much easier to use
Best wishes
Robert Stark
Did this answer your question? We would be grateful if you provide a brief comment as feedback. It may also help others who may have encountered a similar problem.
Please Log in or Create an account to join the conversation.
- brian clark
- Topic Author
- Offline
- User
- Posts: 63
incorrect string type for HTML
not sure if its a language issue i do have many html files in every language. how do i store it accounting for this?
I think your end is fine just mysql setup is off.
Cheers
Please Log in or Create an account to join the conversation.
- brian clark
- Topic Author
- Offline
- User
- Posts: 63
no idea. utf8 or something not sure if its
Please Log in or Create an account to join the conversation.
- FlowHeater-Team
- Offline
- Admin
Best wishes
Robert Stark
Did this answer your question? We would be grateful if you provide a brief comment as feedback. It may also help others who may have encountered a similar problem.
Please Log in or Create an account to join the conversation.
- brian clark
- Topic Author
- Offline
- User
- Posts: 63
there are no examples
i think it would be neat to add some replace examples as i am completely clueless on how regex works in your software
i am using utf8 as standard
string output and correct settings untouched in flowheater
65001, utf-8,Unicode (UTF-
text string output
for regex i am using just and replace with ''
is this correct?
Please Log in or Create an account to join the conversation.
- brian clark
- Topic Author
- Offline
- User
- Posts: 63
so tried this same error
what am i doing wrong here with mysql
Please Log in or Create an account to join the conversation.
- brian clark
- Topic Author
- Offline
- User
- Posts: 63
ive tried standard utf8 and general ci
utf8mb4 and utf8mb4_general_ci
Please Log in or Create an account to join the conversation.
- FlowHeater-Team
- Offline
- Admin
then you have to use the following settings into the SQLite Adapter, see Screenshot below.
Enable: Use UNICODE strings
Set the encoding to UTF-16
Best wishes
Robert Stark
Did this answer your question? We would be grateful if you provide a brief comment as feedback. It may also help others who may have encountered a similar problem.
Please Log in or Create an account to join the conversation.
- brian clark
- Topic Author
- Offline
- User
- Posts: 63
not sure if unicode option on is a good or bad idea
but so far so good. its writing now. so either its blank data or something in the files or missing html files altogether
Please Log in or Create an account to join the conversation.
- brian clark
- Topic Author
- Offline
- User
- Posts: 63
can you show how in regex to remove all of them in a page or replace with double single quotes
Please Log in or Create an account to join the conversation.
- brian clark
- Topic Author
- Offline
- User
- Posts: 63
any ideas on how to remove them or at least save their contents in english while removing them or just allowing to store them.
so far everything is writing except 1100 errors out of 100k sites.
Please Log in or Create an account to join the conversation.
- brian clark
- Topic Author
- Offline
- User
- Posts: 63
/////////////////////////////////////////////
// //
// //
// + + //
// +++ +++ //
// +++++++++ //
// +++++ /\ ///
// +++ / \ / //
// + / \ / //
// / \ / //
// / \_________/ //
// / //
// / \ / //
// / __\ /__ //
// / //
// / ____ //
// / \ / //
// / _~_ \/ _~_ //
// / / / | | //
// / / / | | //
// / / / | | //
//////////////////////////////////////////-->
Please Log in or Create an account to join the conversation.
- FlowHeater-Team
- Offline
- Admin
I´m also able to import your copy & pasted HTML code into my SQLite database, see screenshot below. Please note: The SQLite Adapter takes care of the most special characters (e.g. ‘ “ \, and so on) during the import process. I guess there is another sign inside your document who throws the issue? And of course must be fixed. Could you please add your certain document and created FlowHeater Definition into a ZIP archive and post it here, Thanks.
Best wishes
Robert Stark
Did this answer your question? We would be grateful if you provide a brief comment as feedback. It may also help others who may have encountered a similar problem.
Please Log in or Create an account to join the conversation.
- brian clark
- Topic Author
- Offline
- User
- Posts: 63
but trying out csv to csv..... is there a warning when i have reached 100k?
I cant seem to get any writes.
Will purchase tomorrow. Pretty good software but needs some copy functions in the end results both for errors and results.
Please Log in or Create an account to join the conversation.
- FlowHeater-Team
- Offline
- Admin
I see the heart and I´m able to import them any time, see my previous screenshot. I could only use your copy&past content that’s not the original file.
The warning is only for information. After processing the Designer just show max. 100.000 records. The csv file is written without limitation.
For automation and error handing I´ll recommend the FlowHeater Server .
Best wishes
Robert Stark
Did this answer your question? We would be grateful if you provide a brief comment as feedback. It may also help others who may have encountered a similar problem.
Please Log in or Create an account to join the conversation.
- brian clark
- Topic Author
- Offline
- User
- Posts: 63
Please Log in or Create an account to join the conversation.
- brian clark
- Topic Author
- Offline
- User
- Posts: 63
Please Log in or Create an account to join the conversation.
- FlowHeater-Team
- Offline
- Admin
Please do not mix separate questions into one topic and open for each question a separate topic her in the Support Forum, Thanks.
JSON and MongoDB) For MongoDB you need an additional ODBC driver to connect the database. Afterwards you could mix each FlowHeater Adapter like you want.
CSV and UTF8) The TextFile Adapter reads/writes every time the correct selected codepage, believe me Please open a separate topic and post a bit more information what you have done and what´s the result. An example would be helpful.
Best wishes
Robert Stark
Did this answer your question? We would be grateful if you provide a brief comment as feedback. It may also help others who may have encountered a similar problem.
Please Log in or Create an account to join the conversation.