[H-GEN] sed query

Jason Parker-Burlingham jasonp at uq.net.au
Tue May 6 09:40:13 EDT 2003


[ Humbug *General* list - semi-serious discussions about Humbug and     ]
[ Unix-related topics. Posts from non-subscribed addresses will vanish. ]

"Scott Pullen" <spullen at optusnet.com.au> writes:

> I have received a file at work that must be the work of a mad genius.  They
> have joined a large number of reasonably small (3 kb) ascii text files
> together in such a way that all contents are on one line.  more seems to
> handle the file and wrap the lines and displays the file.  vi reports that
> the line is too long and won't display.

If the exact placement of the inserted newlines is not important,
you can probably just use fmt(1):

	$ wc scrlt11.txt 
	      1   88612  508856 scrlt11.txt
	$ fmt scrlt11.txt | wc 
	   7160   88612  508086
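
If a particular width matters, fmt takes a -w option giving the maximum
line width.  A minimal sketch, assuming a generated stand-in for the real
file and an arbitrary width of 65 (neither comes from the original post):

```shell
# Assumed stand-in for the real file: one long line of 300 short words.
long_line=$(awk 'BEGIN { for (i = 0; i < 300; i++) printf "word "; print "" }')
# -w sets the maximum output width; 65 is an arbitrary choice here.
wrapped=$(printf '%s\n' "$long_line" | fmt -w 65)
# Count the lines fmt produced from the single input line.
printf '%s\n' "$wrapped" | wc -l
```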

Using GNU sed I was able to do the following:

	$ wc scrlt11.txt 
	      1   88612  508856 scrlt11.txt
	$ sed --version
	GNU sed version 4.0.5
	[..]
	$ sed -re's/(.{65}[^ ]*) /\1\n/g' scrlt11.txt | wc
	   7384   88612  508856

The -r option turns on extended regular expressions, so that the {65}
quantifier and the ( ) group work without backslashes.  Both -r and the
\n in the replacement are undoubtedly evil GNU extensions, so here is
roughly the same thing in a language you can depend on:

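        $ perl -e'$_ = <>;    # read the single long line
                  # {1,65} rather than {0,65}: an empty match at the
                  # end of the input would print a stray blank line
                  while(m/(.{1,65}\S*)\s*/g) {
                      print "$1\n"
                  }' scrlt11.txt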
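
For comparison, the sed substitution can probably also be written without
-r.  In basic regular expressions the interval is spelled \{65\} and the
group \( \), and a backslash followed by a literal newline in the
replacement is the portable way to insert a newline.  A sketch, assuming
a POSIX sed and a generated stand-in for the real file:

```shell
# Assumed stand-in for the real file: one long line of 300 short words.
long_line=$(awk 'BEGIN { for (i = 0; i < 300; i++) printf "word "; print "" }')
# Basic-regex spelling of the same substitution; the backslash at the end
# of the first line puts a literal newline into the replacement text.
wrapped=$(printf '%s\n' "$long_line" | sed 's/\(.\{65\}[^ ]*\) /\1\
/g')
# Count the lines sed produced from the single input line.
printf '%s\n' "$wrapped" | wc -l
```

As with the GNU version, each break lands on the first space after the
65th character, so lines come out a little longer than 65 characters.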

My guess is that your sed pattern is not matching your input, and that
sed is then reading a line, noticing it does not match, and writing
the string unchanged to the output.
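
That behaviour is easy to reproduce.  A small sketch, with a made-up
input line and a made-up pattern chosen so that it cannot match (neither
is taken from your script):

```shell
# Hypothetical input and pattern, chosen so that the pattern cannot match.
input='one long line with no digits in it'
output=$(printf '%s\n' "$input" | sed 's/[0-9][0-9]*/N/g')
# sed prints the line as it read it, with no warning that nothing matched.
printf '%s\n' "$output"
```

The output is identical to the input, which is why a pattern that never
matches can look like sed doing nothing at all.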

If you could post a sample of the input and a copy of the sed script
you are using we would be able to tell for sure.

jason
-- 
``Oooh!  A gingerbread house!  Hansel and Gretel are set for life!''

--
* This is list (humbug) general handled by majordomo at lists.humbug.org.au .
* Postings to this list are only accepted from subscribed addresses of
* lists 'general' or 'general-post'.  See http://www.humbug.org.au/
