I have ASCII files and want to convert them into Excel or a tab/CSV-delimited text file. Each file is a table with field names and field attributes. It also includes the index name, table name, and field(s) to index, if required by the software, but I don't think that part matters; the field names and field attributes should be enough. I just want the information hidden inside. Can you experts help me get this done?
The lines are something like this:
10000001$"WORD" WORD$10001890$$$$495.7$$$N$$
10000002$11-word-word word$10000002$$$$$$$Y$$
10000003$11-word word word$10033315$0413004$$$$$$N$$
10000004$11-word word word$10033315$$$$$$$Y$017701$
The general answer, before knowing your ASCII file in detail, your operating system, and so on, would be:
1 - cut the top n lines that contain the information you don't want. Leave the field names, if you want to.
2 - check whether the fields are separated by a common character; in your sample lines it looks like a dollar sign ($).
3 - import the file into a spreadsheet program, like Excel or OpenOffice Calc. In OOCalc, choose to import the file, then select the correct separator character.
That's all.
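If you would rather script the conversion, here is a minimal Ruby sketch (the filenames are placeholders, and it assumes the $-separated layout shown above):

require 'csv'

# Hypothetical input/output names; adjust to your files.
File.open("fields.csv", "w") do |out|
  File.foreach("input.txt") do |line|
    fields = line.chomp.split("$", -1)  # -1 keeps trailing empty fields
    out.print fields.to_csv             # to_csv quotes where needed and ends with a newline
  end
end

Skip any header lines first (step 1 above) if your file has them.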
I have a file I am reading into a Blob via Data Factory.
It's formatted in Excel. Some of the column headers have special characters and spaces, which isn't good if you want to take it to CSV or Parquet and then SQL.
Is there a way to correct this in the pipeline?
Example
"Activations in last 15 seconds high+Low" "first entry speed (serial T/a)"
Thanks
Normally, Data Flow can handle this for you by adding a Select transformation with a Rule:
Uncheck "Auto mapping".
Click "+ Add mapping".
For the column name, enter "true()" to process all columns.
Enter an appropriate expression to rename the columns. This example uses regular expressions to remove any character that is not a letter.
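For reference, the rule might look something like this (a sketch only; the exact syntax depends on your Data Flow version, and $$ stands for the incoming column name in a rule-based mapping):

Matching condition: true()
Name as: regexReplace($$, '[^a-zA-Z]', '')

regexReplace here strips everything except letters from each matched header.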
SPECIAL CASE
There may be an issue with this if the column name contains forward slashes ("/"). I accidentally came across this in my testing:
Every one of the columns not mapped contains forward slashes. Unfortunately, I cannot explain why this is the case, as Data Flow is clearly aware of the column names. It can be addressed manually by adding a Fixed rule for EACH offending column, which is obviously less than ideal.
ANOTHER OPTION
The other thing you could try is to pre-process the text file with another Data Flow, using a Source dataset that has no delimiters. This would give you the contents of each row as a single column. If you could get a handle on just the first row, you could remove the special characters.
I'm currently writing a crawler for a website, and my goal is to have a CSV with a name in the first column and an image in the second, inserted with a Ruby script using the CSV#open method.
I have already used this method, but I can't find any information about the problem of inserting an image into a column.
Is it really possible? If not, what would you use to end up with a list of name + image after crawling?
A CSV (Comma Separated Values) file is a TEXT file which, as the name implies, has various values separated by commas, expressed in plain ASCII or sometimes Unicode. It is intended as a lightweight way to transfer tabular data between different computer systems or programs. You can use it to dump a table from a database, or the values in something like a spreadsheet. The normal convention is for the first row (line) of the file to contain names or labels describing what each column contains, with the data in the subsequent rows.
As such, there really is no practical way to embed an image within a CSV file. This is not a limitation of Ruby or Watir, but a limitation of text files that spans pretty much all languages and operating systems.
To do what you want, you would be better off saving the images into a specific directory under unique filenames and inserting those filenames into the CSV file.
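A minimal Ruby sketch of that approach (the names, URLs, and output paths below are made up; substitute whatever your crawler actually collects):

require 'csv'
require 'open-uri'
require 'fileutils'

# Hypothetical crawl results: [name, image URL] pairs.
results = [
  ["Alice", "https://example.com/alice.png"],
  ["Bob",   "https://example.com/bob.png"],
]

FileUtils.mkdir_p("images")

CSV.open("names.csv", "w") do |csv|
  csv << ["name", "image_file"]
  results.each_with_index do |(name, url), i|
    path = "images/image_#{i}.png"            # unique filename per image
    File.binwrite(path, URI.open(url).read)   # save the image itself to disk
    csv << [name, path]                       # store only the path in the CSV
  end
end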
I have this CSV file that I would like to parse with Ruby. The file's data is a mess, with commas and newlines inside the fields, but Excel still reads it properly. If the file could be exported from Excel using the unit and record separators as the delimiters for columns and rows, I'd be golden.
Anybody know how to specify those characters in Excel? Thanks!
Use Ruby CSV with this option:
:col_sep
The String placed between each field. This String will be transcoded
into the data’s Encoding before parsing.
See more here: http://ruby-doc.org/stdlib-2.0.0/libdoc/csv/rdoc/CSV.html
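For example, a short sketch (assuming the exported file really does use the ASCII unit separator between fields and the record separator between rows; the filename is a placeholder):

require 'csv'

# \x1F = unit separator (between fields), \x1E = record separator (between rows)
rows = CSV.read("export.txt", col_sep: "\x1F", row_sep: "\x1E")
rows.each { |row| p row }

With those separators, commas and newlines inside the fields no longer matter.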
I ended up having Google Sheets export the file as JSON. Steps I followed here. There were 10,000 records, and the browser tab crashed when it tried to do all of them, so I had to piecemeal it. I'm sure there's a better way to do it.
I am retrieving data from a DB and joining the fields with commas between them to generate a CSV.
But the problem is that one of the fields is Company Name, and its data can include a comma, which leads to a malformed CSV file.
Example: Name, Telephone, Email
AAA, 12345, aaa@mail.com
BBB Co,.Ltd, 43466, bbb@gmail.com
For the record BBB, the generated CSV is broken, because the data itself includes a , character.
How should I produce correct CSV for records that include , ?
Most developers handle this situation by using a different character instead of the comma. But I would suggest you look at an old post here:
Dealing with commas in a CSV file
Is your question related to Salesforce APEX?
When the CSV is generated, there ought to be an option to enclose the fields in double quotes so that commas can appear inside the field content. For example: "Company, Name","1234","etc."
The CSV generator will also escape any double quotes inside a field, conventionally by doubling them, like this: "Some field with ""double"" quotes","123","etc" (some generators use backslash escapes such as \" instead).
This all means you need a CSV parser that can handle these situations.
If your question is related to Salesforce APEX then it is quite difficult to build such a CSV parser because of the limitations Salesforce imposes on the number of statements that can run in any given action.
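If you are generating the CSV outside of Salesforce, any standard CSV library will apply this quoting for you. A minimal Ruby sketch using the rows from the example above (Ruby chosen purely for illustration; every language has an equivalent library):

require 'csv'

rows = [
  ["Name", "Telephone", "Email"],
  ["AAA", "12345", "aaa@mail.com"],
  ["BBB Co,.Ltd", "43466", "bbb@gmail.com"],  # comma inside the field
]

csv_text = CSV.generate do |csv|
  rows.each { |row| csv << row }
end
puts csv_text  # the BBB row comes out as "BBB Co,.Ltd",43466,bbb@gmail.com

By default the library quotes only the fields that need it; pass force_quotes: true to CSV.generate if you want every field quoted.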
I have a list of names (never over 100 names) with a value for each of them, either 3 or 4 digits.
john2E=1023
mary2E=1045
fred2E=968
And so on... They're formatted exactly like that in the .txt file. I have Python and Excel, and I'm also willing to download whatever I need.
What I want to do is sort all the names by their values in descending order, so the highest is on top. I've tried Excel: I replaced the '2E=' with ',' so I could have name,value, then imported the data so each is in a separate column, but I still couldn't sort them any way other than A to Z.
Help is much appreciated; I did take my time to look around before posting this.
Replace the "2E=" with a tab character so that the data lands in Excel in two columns, then sort on the value column. Make sure Excel treats the values as numbers rather than text, otherwise the sort will still be alphabetical.
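If you'd rather script it than fight Excel, it's only a few lines. A minimal Ruby sketch (assuming every line has exactly the name2E=value shape shown above; the input filename is a placeholder):

# Parse "name2E=value" lines, then sort by value, highest first.
pairs = File.readlines("names.txt").map do |line|
  name, value = line.chomp.split("2E=")
  [name, value.to_i]
end

pairs.sort_by { |_, value| -value }.each do |name, value|
  puts "#{name}\t#{value}"
end

The same logic is just as short in Python, which you mentioned you already have.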