seders newsletter #6


From: Al Aab <af137@freenet.toronto.on.ca>
Sent: 17 September 1996 06:53

=========================== seders newsletter #6 ===============================

1. In an upcoming newsletter, you will probably see your first sed script
   of the second order/degree. The problem to be solved thusly has been
   suggested in a previous newsletter. It is repeated here. Remember the
   book chapters ... ? Some little-known utilities will be mentioned.
2. In an upcoming newsletter, you will probably see your first sed script
   to do grammar. A simple parser.
3. Does anyone know how to set up a sed newsgroup?
4. Does anyone want to become the moderator?
5. Is anyone willing to offer seders a web home?
6. For now, here are some Usenet extracts & sed stuff. You are welcome to
   send me your comments thereon.
   NB: Most of the following is not Al Aab's. Just quotes. Excuse the coming
       mess, there are jewels. If you cannot see them, contribute some.

                                   - + -

I have seen a couple of examples of sed with a regexp extending over two
lines, but I didn't understand them.  I would like to have one that is
about as trivial as possible.  Quoted-printable messages show up in my
elm with lines cut at 76 characters, an = sign appended, and the
remaining characters on the next line.  Yes, there are numerous mailers
that avoid the problem, but I'd like to see how to fix it in sed.  When
the final character on a line is =, delete it and the newline.
Conceptually it is:

sed 's/=
//'

but that's garbage to sed.  Is there a simple way to do this with sed?
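
No reply to this one made it into the digest. A minimal sketch of one
way to do it, assuming a sed that takes \n in a regex to mean the
newline that N leaves in the pattern space (POSIX says it should). The
name join_qp is made up for the example. It undoes only the soft line
breaks, not the =XX hex escapes.

```shell
# join_qp: hypothetical helper; glues a line ending in "=" to the next line
join_qp() {
    # :a        loop label
    # /=$/{     only for lines that end in "="
    # $!N       append the next input line, unless this is the last line
    # s/=\n//   drop the "=" and the newline that N left behind
    # ta        something was joined, so test the longer line again
    sed -e ':a' -e '/=$/{' -e '$!N' -e 's/=\n//' -e 'ta' -e '}' "$@"
}

printf 'one long li=\nne of text\nshort\n' | join_qp
```

The sample prints "one long line of text" and then "short". The $!N
guard keeps a final line that ends in = from making sed quit early.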

--------------------------------------------------------------------------------

In article <01bb858b$373ccb20$235d6cce@default>, Dwayne Moore
 wrote:

>I am trying to use SED to replace the pattern ZZZZZZ with the current date
>of the form YYMMDD by doing the
>following:
>
>DATE=`date '+%y%m%d'`
>sed 's/ZZZZZZ/$DATE/g' sourcefile.C > newsourcefile.C
>
>but the pattern ZZZZZZ is replaced with $DATE instead of the actual date.
>
>Can anyone shed some light on why this doesn't work?
>
>Thanks In Advance,
>Dwayne Moore.

Change the single quotes to double quotes and it will work.

sed "s/ZZZZZZ/$DATE/g" sourcefile.C > newsourcefile.C

Double quotes allow the shell to expand the variable before sed sees it.


  richk
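
To see the difference side by side, here is a small sketch, assuming a
Bourne-style shell. The sample line is invented. The third form keeps
the sed script in single quotes and steps outside them only for the
variable, which helps when the script also holds characters the shell
would otherwise touch.

```shell
DATE=$(date '+%y%m%d')

# single quotes: the shell leaves $DATE alone, so sed inserts it literally
printf 'stamp ZZZZZZ\n' | sed 's/ZZZZZZ/$DATE/g'

# double quotes: the shell substitutes the date before sed runs
printf 'stamp ZZZZZZ\n' | sed "s/ZZZZZZ/$DATE/g"

# mixed: single-quoted script, double-quoted variable in the middle
printf 'stamp ZZZZZZ\n' | sed 's/ZZZZZZ/'"$DATE"'/g'
```

This works because YYMMDD is all digits. A value containing / or &
would need escaping first, since both are special on the right-hand
side of s///.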

--------------------------------------------------------------------------------

avdmeer <ameer@ucc.nl> wrote:
: Can anybody tell me how I can use ex to do things like this:
: 1. Replace a certain string in 100 files with another string ?
: 2. Replace a certain string in 100 files with another string only when 
: another certain string is present in the file ?

For item 1 above, where the names of the 100 files are kept in a file
called "list":

for i in `cat list`
do
    # write to a scratch file, and replace the original only if sed succeeded
    sed -e "s/old_string/new_string/g" "$i" > "$i.x" && mv "$i.x" "$i"
done

For item 2 above:

grep -l "another_string" * >list   # makes a list of the files
                                   # containing another_string
Then run the above script.
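
The two steps can also be folded into one loop with no list file. A
sketch only: replace_if and its argument order are invented for the
example, and the strings are assumed to contain nothing special to
grep or sed (no /, &, or regex characters).

```shell
# replace_if: hypothetical helper
#   $1 = string that must be present, $2 = old string, $3 = new string,
#   remaining arguments = files to process
replace_if() {
    guard=$1 old=$2 new=$3
    shift 3
    for f in "$@"
    do
        # grep -q is silent and succeeds only if the guard string occurs
        if grep -q "$guard" "$f"
        then
            # keep the original unless sed finished cleanly
            sed "s/$old/$new/g" "$f" > "$f.x" && mv "$f.x" "$f"
        fi
    done
}
```

Usage would look like: replace_if another_string old_string new_string *.html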

--------------------------------------------------------------------------------

Hi:

I am trying to use awk in my c shell script, but the $0 in awk
(which stands for the whole line) is mistakenly interpreted as
the name of c shell script.

how can i make it correctly understood?

My script is included below:

foreach line ("`awk 'NR>3 {print $0}' lic.dat`")
  if ("`grep $line $SPW_SYS_DB/site_data/license/license.dat.123`" != "") then
    echo "#$line" >> $SPW_SYS_DB/site_data/license/license.dat
  else
    echo "$line" >> $SPW_SYS_DB/site_data/license/license.dat
  endif
end

***********

Another question, I tried using 
foreach line ("`sed -n '3,$p' lic.dat`")
but c shell will try to interpret $p as a variable :(

how can i get around these??? please don't ask me to switch to
other shells, i want to know how to do it in c shell.

thanks.

Charlie
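
The digest carries no answer to Charlie. One way around both problems,
sketched here and not tested under every csh, is to pick equivalent
commands that contain no dollar sign at all, so csh has nothing to
expand inside the double quotes. The sample data is made up, and the
lines below run under sh only to show that the outputs match.

```shell
# sample: stand-in for lic.dat (contents invented for the demonstration)
sample() { printf 'one\ntwo\nthree\nfour\nfive\n'; }

sample | awk 'NR>3 {print $0}'   # the original form
sample | awk 'NR>3'              # same output: a pattern with no action prints the line

sample | sed -n '3,$p'           # the original form
sample | sed '1,2d'              # same output: delete lines 1-2, print the rest
```

In the csh script the loop header would then read, presumably,
foreach line ("`awk 'NR>3' lic.dat`") or, for the sed variant,
foreach line ("`sed '1,2d' lic.dat`").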

--------------------------------------------------------------------------------

In article <50db35$bkr@epervier.CC.UMontreal.CA>,
CHAN TANG Eric-Aubert <chantane@JSP.UMontreal.CA> wrote:
: I need to get the text enclosed within two specific strings (in SH script).
: For example, the command should work with all the following examples,
: being able to get what's inside '<p>' and '</p>'.
: 
: * Example 1:
: <p>Text body is here</p>
: 
: * Example 2:
: <p>Text
: body
: is
: here</p>
: 
: * Example 3:
: <p>
: Text body
: is here
: </p>
: 
: You get the point. The command should output 'Text body is here'.
: 
: I think this could be done using 'awk', but I shamefully don't know how to do
: this. So if you have any clue, I'd really appreciate your help.

Why not just use lynx -dump?

--Dave
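
For seders who would rather stay in sed, here is a minimal sketch that
covers the three examples. Assumptions: at most one <p>...</p> pair
starts on any line, the tags are lower case with no attributes, and
p_text is a made-up name.

```shell
# p_text: hypothetical helper; prints the text of each <p>...</p> block on one line
p_text() {
    sed -n '
/<p>/{
    # keep appending input lines until the closing tag shows up (or input ends)
    :a
    /<\/p>/!{
        $!{
            N
            ba
        }
    }
    # turn the embedded newlines into spaces, then strip the tags
    # together with anything outside them
    s/\n/ /g
    s/.*<p> *//
    s/ *<\/p>.*//
    p
}' "$@"
}

printf '<p>\nText body\nis here\n</p>\n' | p_text
```

The sample is Example 3 from the question and prints "Text body is
here". Real-world HTML is messier than this, which is the argument for
lynx -dump.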

--------------------------------------------------------------------------------

You aren't considering the quoting properly. There are three levels of
escapes in the regex that you feed to grep. The first is gobbled by
csh in forming the alias, the second is gobbled by csh in executing
the alias, and the third is in grep itself. thus:

    \[\ \   ~\]
         ^this is a tab

should work within your single quotes. Or you can mess about with $IFS.
Someone (was it Brian Hiles?) posted an overview of shell quoting a
while ago which might be useful to you.

            Pete


In article <DwzFo9.F0H@cadence.com>, John Gianni <jjg@cadence.com> wrote:
>Using csh, the following:
>      alias sk 'grep \~\>\!:1\[abc~\] \!:2*'
>      sk foo *.il
>
>Will catch all instances of:
>      ~>fooa
>      ~>foob
>      ~>fooc
>      ~>foo~
>
>But, I am having trouble replacing abc in the expression above with white space.
>
>The simple things didn't work, e.g.: 
>     \[\t\n~\]
>     \[ ~\] 
>     \[" " \]
>     \[' ' \]
>     etc.
>
>Any clues?
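
Apart from the csh escaping, the grep pattern itself can avoid a
literal tab by using a POSIX character class. A sketch, assuming a
grep that supports [[:space:]] (older greps may not); the sample lines
are invented, and the pattern is shown under sh, not inside the csh
alias.

```shell
# four sample lines: "foo" followed by a space, a tab, a tilde, and an "x"
printf '~>foo bar\n~>foo\tbar\n~>foo~\n~>foox\n' |
    grep '~>foo[[:space:]~]'     # keeps the first three lines, drops "~>foox"
```

Fitting this into the alias still means getting the brackets past csh
twice, as Pete describes.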

--------------------------------------------------------------------------------

Thread 29 of 39 (page 2):  Re: .                sed script needed
        HOW TO EXTRACT CERTAIN CHAPTERS FROM A BOOK
        EACH CHAPTER HAS A LINE LIKE ^-
        FOLLOWED BY A CHAPTER NUMBER/DESIGNATION
             (2 CONSECUTIVE LINES)

/bin/sed \
'#n
# make sure it's a section number on its own line, followed by three dashes
/^[0-9]\{5\}\.b$/N
/^[0-9]\{5\}\.b\n---$/ {
        # slurp up the section-header+body first
        $!N
        : loop
        /^.*\n---$/!b skip
        $!N
        b loop
        : skip
        # since the sections need to be considered as _following_ the
        # section header and following on _and including_ the terminating
        # three dashes, if you so desire we can add them back:
        /PATTERN/ {
                a\
---
                p
        }
}' sedtest

Though a "quick and dirty" solution would be to swap, manually or by
script, the second "---" and the section number, so that a sed script like:

/^---$/,/^---$/ { ... }
(XXXX WRONG XXXX AL AAB)


would be possible. Perhaps by a two-pass sed solution.

P.S. If it were not for the fact that I have the world's most hairy
"sh" script comprising my sed debugger on my SparcStation back home
_without any floppy disk_ I could post it to Seders. Give me a few days
to work it out....

-Brian "'Sed Anonymous' 12-Step Program Graduate" Hiles
--
   ,---.     ,---.     ,---.     ,---.     ,---.     ,---.     ,---.
  /  _  \   /  _  \   /  _  \   /  _  \   /  _  \   /  _  \   /  _  \
.'  / \  `.'  / mailto:bsh20858@challenger.atc.fhda.edu \  `.'  / \  `.

--------------------------------------------------------------------------------

Thu, 08 Aug 1996 13:40:03     comp.unix.questions           Thread 35 of 129
Lines 31                      awk,sed and csh question      1 Response
xiaoyi@altagroup.com          Charlie Wu at Alta Group/Cadence, Sunnyvale, CA.

--------------------------------------------------------------------------------

Thu, 08 Aug 1996 10:17:22     comp.unix.questions           Thread   24 of  129
Lines 52       Re: Sed and streams (need help on search/repl  1 Response
mcclellantj@harrier13  Tad McClellan at Lockheed Martin Tactical Aircraft Syste

Paul D. Short (ps14004@nyssa.swt.edu) wrote:
: I am new to Unix and am trying to get SED to
: textually replace a string in multiple files.
: However, I can't get sed to modify the original
: files.

: For example, if I do this:

: sed 's/old_string/new_string/g' *.html > *.html
                                           ^^^^^^

You can't do this.

The shell expands wildcards _before_ calling sed. If you have 3 html
files (one.html two.html three.html) then this is the same as:

sed 's/old_string/new_string/g' one.html two.html three.html >one.html two.html
three.html
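
A quick way to see what sed is actually handed is to put echo in front
of the command. A sketch, assuming mktemp -d is available for a
scratch directory; the three file names are the ones from the example.

```shell
# scratch directory holding the three (empty) files from the example
dir=$(mktemp -d) || exit 1
touch "$dir/one.html" "$dir/two.html" "$dir/three.html"

# echo prints the argument list after the shell has expanded *.html;
# the subshell keeps the cd from affecting anything else
( cd "$dir" && echo sed 's/old_string/new_string/g' *.html )
```

The names come back sorted, and the quotes are gone because the shell
has already consumed them. sed never sees the asterisk.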


You need to apply the substitutions into a _different_ file (one file
at a time), then copy the different file over the original. In csh:


# WARNING: this blows away the original file !!!

foreach file (*.html)
  sed 's/old_string/new_string/g' $file >$file.tmp
  mv $file.tmp $file
end


: I'll get an error saying file 'whatever' exists.
: I'm thinking this error stems from my misuse (or
: lack of knowledge) of stream commands.
                        ^^^^^^^^^^^^^^^
lack of knowledge of how shells work, would be more accurate.


: Can anyone point me in the right direction?

Perl !

perl -pi -e 's/old_string/new_string/g' *.html

--------------------------------------------------------------------------------

Sun, 11 Aug 1996 01:59:14     comp.unix.questions           Thread   24 of  129
Lines 28       Re: Sed and streams (need help on search/repl  Respno   1 of   1
benton@noc.tor.hookup.net  Dan Murray at HookUp Communication Corporation, Oakv

In article <alank-0708961454200001@news.mjr.com>,
Alan H. Katz <alank@mjr.com> wrote:
>In article <1996Aug7.122258@nyssa.swt.edu>, ps14004@swt.edu wrote:
>
>> For example, if I do this:
>>
>> sed 's/old_string/new_string/g' *.html > *.html

Try:
foreach i (*.html)
# write to a temporary file first: "> $i" would empty $i before sed reads it
sed 's/old_string/new_string/g' $i > $i.tmp
mv $i.tmp $i
end

This assumes that you are using /bin/csh as your shell.

>Generally you can not use a wildcard in a destination file name. You
>should redirect to a file and then cp that to the original (assuming that

--------------------------------------------------------------------------------

> IN OTHER WORDS:  BEAUTIFY THE INPUT FILE BY REMOVING ALL THE BLANKS BETWEEN
> THE DOUBLE-QUOTE PAIR.  USE A GENERAL SED SCRIPT.

here's my solution, dsq.sed:
#!/usr/bin/sed -f
# sed script to eliminate spaces 'tween quotes and upcase everything
# 'tween quotes.
# NOTE: doesn't handle \" or odd numbers of quotes right.

# put a \n at the beginning of the pattern space
s/^/\
/

:line_loop
# if there's no quotes left, we're done with this line
/\n.*"/!bend

# skip non-quoted chars before the next quoted region
s/\(.*\)\(\n\)\([^"]*\)/\1\3\2/

# wrap the next quoted region in \n's
s/\(\n\)\("[^"]*"\)/\1\2\1/

# save it for later use
h

# grab the quoted region
s/[^\n]*\n\(.*\)\n.*/\1/

# do the changes
s/ //g
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/

# add it to the end of the hold space
H

# grab the hold space
g

# replace the old quoted region with the new one
s/\([^\n]*\)\(\n\)[^\n]*\n\([^\n]*\)\n\(.*\)/\1\4\2\3/

bline_loop

:end
s/\n//

>                         SED PROBLEM # 2
>
> GIVEN A BOOK DIVIDED INTO BIG CHAPTERS, EACH CHAPTER MAY BE TOO BIG TO
> FIT IN SED'S

--------------------------------------------------------------------------------

===================================== ======= =============================
To: Andru@romulus.ncsc.mil, Luvisi@romulus.ncsc.mil, luvisi@andru.sonoma.edu
Cc: Al Aab <af137@freenet.toronto.on.ca>
Subject: Re: seders newsletter # 2, 20 august 1996 (fwd)

> From: Andru Luvisi <luvisi@andru.sonoma.edu>
> To: Al Aab <af137@freenet.toronto.on.ca>
> Subject: Re: seders newsletter # 2, 20 august 1996
>
> > IN OTHER WORDS:  BEAUTIFY THE INPUT FILE BY REMOVING ALL THE BLANKS BETWEEN
> > THE DOUBLE-QUOTE PAIR.  USE A GENERAL SED SCRIPT.
> > -------------------------------------------------------------------------
>
> here's my solution, dsq.sed:
>...
> s/\(.*\)\(\n\)\([^"]*\)/\1\3\2/
> s/[^\n]*\n\(.*\)\n.*/\1/
>...

Using a newline character as a marker is a good trick, since a newline
is the one character that can't occur in the line.  I believe [^\n] is
non-portable however, which is one drawback of using a newline vs some
other odd character (and you can't portably put a real newline inside
a character class either, alas).  I also use the \(\n\)...\1 trick
to avoid breaking the s/// across two lines.

> y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/

Here's a related tidbit that comes in handy at times.  It's a
lesser-known "third way" to translate case in sed, using a table
lookup technique.  It allows you to change just part of a line without
going through contortions with the hold space and without using 26
separate substitute commands (the other two ways):

        s/$/aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ/
        : more
                s/\([a-z]\)\(.*\1\)\(.\)/\3\2\3/
        t more
        s/aA[b-zB-Z]*$//

This will work correctly for any input.  One thing it might be suited
for is capitalizing the first letter of every word in the line.  In
fact, you could use the same table to up-shift the first letter of
every word and down-shift the remaining letters of the words (two
s/// commands inside the loop), which would be pretty hard to do
using y///, and would take 52 s/a/A/-type commands inside a loop
the other way.  The pattern above is set up to capitalize everything,
as it stands.
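
The word-capitalizing variant mentioned above might look like the
sketch below. This is one reading of the idea, not the poster's own
script. It assumes words are separated by single or multiple spaces
only, parks the table behind a newline so it cannot be confused with
the text, and does the up-shift half only (no down-shifting of the
remaining letters). capitalize_words is a made-up name.

```shell
# capitalize_words: hypothetical helper; upper-cases the first letter of each
# space-separated word using the table-lookup technique
capitalize_words() {
    sed '
# append a newline and the lookup table to the line
s/$/\
aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ/
:more
# a lower-case letter at the very start of the line: find it in the table
# (after the newline) and copy the character that follows it there
s/^\([a-z]\)\(.*\n.*\1\)\(.\)/\3\2\3/
t more
# the same lookup for a lower-case letter that follows a space
s/\( \)\([a-z]\)\(.*\n.*\2\)\(.\)/\1\4\3\4/
t more
# nothing left to change: throw the newline and the table away
s/\n.*//
' "$@"
}

printf 'hello big world\n' | capitalize_words
```

The sample prints "Hello Big World". Only \n outside a bracket
expression is used, so the [^\n] portability worry does not arise.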








# 950614, 950623, 951018, 951020
# Purpose:  Weeds (deletes) unwanted header lines from "folders"
#   (text files containing emails)
# Installation:  Save this script as "weedout.sed".
# Usage:  sed -f weedout.sed folder > folder.weeded
#
:again
/^Received:/{
  N
  s/^.*\n//
:blah
  # the brackets hold one space and one tab (header continuation lines)
  /^[ 	]/{
    N
    s/^.*\n//
    b blah
  }
  b again
}
#
# Comment the lines which you want to keep with a "#"
#
/^Approved:/d
/^Content-.*:/d
/^Distribution:/d
/^Errors-to:/d
/^Errors-To:/d
/^Full-Name:/d
/^In-Reply-To:/d
/^Lines:/d
/^Message-ID:/d
/^MIME-Version:/d
/^Message-Id:/d
/^Mime-Version:/d
/^NNTP-Posting-Host:/d
/^Organisation:/d
/^Organization:/d
/^Path:/d
/^Phone:/d
/^Post:/d
/^Precedence:/d
/^Received:/d
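
The one-line d commands above remove only the first line of a header,
so a folded header leaves its continuation lines behind. The
Received: loop can be generalized to any header name. A sketch under
these assumptions: weed is a made-up name, the header name contains
nothing special to sed, and your sed knows [[:space:]].

```shell
# weed: hypothetical helper; deletes every "$1:" header line together with
# its continuation lines (lines that start with a space or a tab)
weed() {
    hdr=$1
    shift
    # Same loop as the Received: block, with "$d" added so that a header
    # or continuation on the very last line is dropped and sed stops.
    sed -e ':again' \
        -e "/^${hdr}:/{" \
        -e '$d' -e 'N' -e 's/^.*\n//' \
        -e ':cont' \
        -e '/^[[:space:]]/{' \
        -e '$d' -e 'N' -e 's/^.*\n//' -e 'b cont' \
        -e '}' \
        -e 'b again' \
        -e '}' "$@"
}

printf 'Received: from a\n\tby b\nReceived: from c\nSubject: hi\n\nbody\n' |
    weed Received
```

The sample prints the Subject: line, the blank line, and "body". Like
the original, it does not know where the headers end, so a body line
that begins with the header name is weeded too.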

==========================================================================
I GUESS ONE OF THE IMPORTANT SOLUTIONS ABOVE WAS DUE TO MR UBBEN, THE
SED WIZARD, ALMOST AS WIZARDLY AS MR BRIAN HILES.

al aab, seders moderator                                      sed u soon 
               it is not zat we do not see the  s o l u t i o n          
               it is     zat we do not sed the problem

========================== end of seders newsletter #6 =========================