[H-GEN] Converting csv file to tab delimited text file

Michael Anthon michael at anthon.net
Wed Jun 7 01:58:53 EDT 2006


>
> can only process a Tab Delimited text file, so I have to open the file in
> excel and save it as Tab Delimited text file. Anyone has any alternative to
> automate this process? Please comments.
>

I have to do a fair bit of text data file manipulation myself.  Keep in mind
the escaping issues as someone else mentioned.  Also keep in mind that the
MS idea of CSV is generally broken and even changes from package to package
(Access handles CSV slightly differently to the way Excel does).  I'd go so
far as to say using Excel as an intermediary will eventually lead to tears.
It always does for me. [1]

The biggest issue I have to deal with is the way it handles numbers.
Specifically I deal with numbers that have a leading zero that is important
to retain.  Excel, unless you are careful, will think, "Hey, that's a
number, I'll convert it to an integer for you".  I'm ranting, sorry.

In my book, if a CSV file has a quoted field, that field should be treated
as a string, not a number.

Getting back to the point, IF you have complete control over the source data
and IF you can guarantee that the contents of a field never need escaping,
and IF you can guarantee that there will never be any commas inside the
fields then you can use simple regex with sed or maybe awk to reformat the
file from CSV to TDT.  If you can't guarantee the above then you need to
process the file correctly by parsing the data and writing it out again.
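
To illustrate why the simple substitution breaks, here is a small Python
sketch that does what a sed one-liner replacing every comma with a tab would
do. The sample line is made up for illustration; it has three fields, one of
which is quoted and contains a comma.

```python
# Hypothetical sample record: three fields, the second quoted with an embedded comma.
line = '007,"Smith, John",Brisbane'

# Naive conversion: replace every comma with a tab, as a simple regex would.
naive_line = line.replace(",", "\t")
naive_fields = naive_line.split("\t")

print(len(naive_fields))  # 4 fields instead of the intended 3
print(naive_fields)       # ['007', '"Smith', ' John"', 'Brisbane']
```

The quoted name is split in two and the quote characters are left in the
data, which is exactly the kind of damage that only shows up later.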

Processing CSV is not hard, but there are also Perl and Python modules that
can do all the hard work for you.
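
As a minimal sketch of the module approach, the following uses Python's
standard csv module to read comma-separated rows and write them back out
tab-delimited. The function name csv_to_tsv and the sample data are
assumptions made up for this example, and it assumes the input follows the
usual double-quote quoting convention.

```python
import csv
import io

def csv_to_tsv(src, dst):
    """Copy rows from a CSV stream to a tab-delimited stream (hypothetical helper)."""
    reader = csv.reader(src)  # default dialect: comma delimiter, double-quote quoting
    writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
    for row in reader:
        writer.writerow(row)

# Made-up sample data with a leading-zero value and a quoted field containing a comma.
sample = io.StringIO('id,name,city\n007,"Smith, John",Brisbane\n')
out = io.StringIO()
csv_to_tsv(sample, out)
print(out.getvalue(), end="")
```

With real files, open both with newline="" as the csv module documentation
recommends. The reader returns every field as a string, so 007 keeps its
leading zeros. If a field itself contains a tab or a double quote, the writer
will quote it, so check whether the program consuming the file accepts that.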

In short, be wary of very simple solutions (such as sed/awk) if you can't
control the source data.  Be even more wary of Excel.

End of rant.

Cheers,

Michael

[1] Not my tears, mind you, but those of the other people who give me
butchered files, which leads to other complications when people complain
about me.