Remove line break unless ^## [Archive] - linuxforen.de -- User helfen Usern

Huhn Hur Tu

17.06.13, 13:36

Hi all,

I have a large file of SQL statements, but they are spread over several lines each, which is rather impractical for working in the console. So I would like a script that converts the statements into one-liners.

I have got as far as giving every line that should keep its line break two leading hash signs, but no further.

So the statements are in the following format:

##
## Text text text text
select j.job_id, j.aktion, j.bz_id, j.ts_init, j.gueltig_ab, datediff(dayofyear, uj.ts_init, getdate()) as 'day diff since ts_init', datediff(dayofyear, uj.lastchange, getdate()) as 'day diff since lastchange',
uj.job_id, uj.bz_id, uj.ts_init, uj.lastchange, us.*
from job uj NOHOLDLOCK, job j NOHOLDLOCK, subjob us NOHOLDLOCK
where uj.aktion=7 and uj.bz_id not in (20000, 40000, 89795) and us.job_id=uj.job_id and j.job_id=uj.ref_job_id and j.bz_id in(11299, 11399, 11499, 11500)
##
select count(uj.job_id)
from job uj NOHOLDLOCK, job j NOHOLDLOCK, subjob us NOHOLDLOCK
where uj.aktion=7 and uj.bz_id not in (20000, 40000, 89795) and us.job_id=uj.job_id and j.job_id=uj.ref_job_id and j.bz_id in(11299, 11399, 11499, 11500)
##
## Text text text
select j.job_id, j.aktion, j.bz_id, j.ts_init, j.gueltig_ab, datediff(dayofyear, uj.ts_init, getdate()) as 'day diff since ts_init', datediff(dayofyear, uj.lastchange, getdate()) as 'day diff since lastchange',
uj.job_id, uj.bz_id, uj.ts_init, uj.lastchange, us.*
from job uj NOHOLDLOCK, job j NOHOLDLOCK, subjob us NOHOLDLOCK
where uj.aktion=7 and uj.bz_id not in (20000, 40000, 89795) and us.job_id=uj.job_id and j.job_id=uj.ref_job_id and j.bz_id in(11299, 11399, 11499, 11500) order by datediff(dayofyear, uj.ts_init, getdate()) desc

These should now lose their line break wherever there are no leading hash signs.

The goal is:

##
## Text text text
select ......
##
## Text text text
select ......

And to pre-empt the question: the file with the statements is changed regularly, and the aim is to reduce the maintenance effort.

None of my attempts with sed and tr have worked well so far.

Regards, Stefan

nopes

17.06.13, 13:41

Off the top of my head, I would use this regex (Perl):
$text=~s/\n([^#]{2})/$1/g;
Addendum: {2} can be dropped if need be, e.g. when a line contains only a ";" at the end; otherwise a line has to consist of at least 2 characters.

Huhn Hur Tu

17.06.13, 14:01

Hi Nopes, thanks. Unfortunately I'm not at all proficient in Perl, which is probably why I'm having trouble with it.

I have a sed chain here and still want to append the part that removes the line breaks. Do you have a tip?

cat ~/svn/SwisApps/ops/statements_tcs.sql | sed 's/$/ /' | sed 's/^ /--/g' | sed 's/--/## /g' | sed '/^[^##]/ s/ $//g' | (remove line breaks here)

I think the problem is that Perl doesn't get at the data here.

Regards, Stefan

nopes

17.06.13, 14:09

It should work with the regex mentioned above, i.e.
...|sed 's/^([^#].*)\n([^#])/\1 \2/g'

Brief explanation: [^#] means one character that is not '#' (a negated character class). So if a line break (\n) is followed by a character != '#', the line break is removed. I left out {2} so that something like this:
## bla
sql
...
;
## bla
becomes:
## bla
sql ... ;
## bla
The ' ' before \2 makes sure that nothing gets merged by accident. For example,
## bla
select
*
...
could, with '\1\2' (no space in between), become:
## bla
select*...

One more addendum: I'm not sure right now whether sed processes several lines at once. If it doesn't, you could append something like this:
perl -p -e '/^([^#].*)\n([^#])/$1 $2/gs' file
That is also why my first expression is wrong, since it would pull the SQL into the comment!

So this is the correct pattern: /^([^#].*)\n([^#])/$1 $2/

Huhn Hur Tu

17.06.13, 14:37

First I have to think about how this is supposed to work.

nopes

17.06.13, 15:08

The expression looks for lines that do not begin with a #. If the next line does not begin with a # either, the line break between the two is removed.

Aqualung

17.06.13, 18:11

Huhn Hur Tu

18.06.13, 13:08

Hi Nopes,

unfortunately it doesn't work like that yet:

/tmp$ cat ~/svn/SwisApps/ops/cancellation/SQL/monitor_statements_tcs.sql | sed 's/$/ /' | sed 's/^ /--/g' | sed 's/--/## /g' | sed '/^[^##]/ s/ $//g' | perl -p -e '/^([^#].*)\n([^#])/$1 $2/gs'
Scalar found where operator expected at -e line 1, near "/^([^#].*)\n([^#])/$1"
(Missing operator before $1?)
Scalar found where operator expected at -e line 1, near "$1 $2"
(Missing operator before $2?)
syntax error at -e line 1, near "/^([^#].*)\n([^#])/$1 "
Execution of -e aborted due to compilation errors.

and with sed:

~/tmp$ cat ~/svn/SwisApps/ops/cancellation/SQL/monitor_statements_tcs.sql | sed 's/$/ /' | sed 's/^ /--/g' | sed 's/--/## /g' | sed '/^[^##]/ s/ $//g' | sed 's/^([^#].*)\n([^#])/\1 \2/g'
sed: -e expression #1, char 27: invalid reference \2 on `s' command's RHS

And to Aqualung: tr turns the file into a single line for me.

Regards, Stefan

Aqualung

18.06.13, 17:22

You can convert the line break into any character you like, e.g. a space:

tr "\012" " "

karl-heinz-lnx

18.06.13, 19:50

Hi Nopes,

unfortunately it doesn't work like that yet:

/tmp$ cat ~/svn/SwisApps/ops/cancellation/SQL/monitor_statements_tcs.sql | sed 's/$/ /' | sed 's/^ /--/g' | sed 's/--/## /g' | sed '/^[^##]/ s/ $//g' | perl -p -e '/^([^#].*)\n([^#])/$1 $2/gs'
Scalar found where operator expected at -e line 1, near "/^([^#].*)\n([^#])/$1"
(Missing operator before $1?)
Scalar found where operator expected at -e line 1, near "$1 $2"
(Missing operator before $2?)
syntax error at -e line 1, near "/^([^#].*)\n([^#])/$1 "
Execution of -e aborted due to compilation errors.

The error message comes from the Perl call.
The call
perl -p -e '/^([^#].*)\n([^#])/$1 $2/gs'
is, in my opinion, also wrong. You want to substitute something in Perl, so the "s" (for substitute) belongs at the front:

perl -p -e 's/^([^#].*)\n([^#])/$1 $2/g'

Without the "s" in front of the regex Perl has a problem, because then Perl only searches.

Example:

/RegExp/g
only searches for the regex

s/RegExp/Regular Expression/g
replaces the expression.
The "s" at the end is irrelevant here. The error message is therefore easy to explain.

nopes

19.06.13, 09:52

Hi,

sorry, I sent you down the wrong track there. The problem with -p is that Perl builds a loop that goes through the file line by line, so it is the same problem as with sed. Since I feel somewhat responsible, here is a small script that does exactly what you want.

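#!/usr/bin/perl
use strict;
use warnings;

# Read the whole file into one string.
sub readFile {
    my $fileName = shift;
    open(my $in, "<", $fileName) or die "Error reading '$fileName'!\n$!\n";
    my $cont = join("", <$in>);
    close $in;
    return $cont;
}

# Write a string to a file, replacing any previous content.
sub writeFile {
    my ($fileName, $cont) = @_;
    open(my $out, ">", $fileName) or die "Error writing '$fileName'\n$!\n";
    print $out $cont;
    close $out;
}

###################################################################################################
my $fileI = shift or die "no input file!\n";
my $fileO = shift or die "no output file!\n";
my $sqlI = readFile($fileI);
my $sqlO = "";
my $inSql = 0;    # true while a joined SQL line is still open
for my $line (split(/\n/, $sqlI)) {
    if ($line =~ /^#/) {
        # Comment line: close an open SQL line first, then keep the comment on its own line.
        $sqlO .= $inSql ? "\n$line\n" : "$line\n";
        $inSql = 0;
    }
    else {
        # SQL line: append it to the current one-liner, separated by a single space.
        $sqlO .= $inSql ? " $line" : $line;
        $inSql = 1;
    }
}
$sqlO .= "\n" if $inSql;    # terminate the last SQL line
writeFile($fileO, $sqlO);

It turns this: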
##
## Text text text text
select j.job_id, j.aktion, j.bz_id, j.ts_init, j.gueltig_ab, datediff(dayofyear, uj.ts_init, getdate()) as 'day diff since ts_init', datediff(dayofyear, uj.lastchange, getdate()) as 'day diff since lastchange',
uj.job_id, uj.bz_id, uj.ts_init, uj.lastchange, us.*
from job uj NOHOLDLOCK, job j NOHOLDLOCK, subjob us NOHOLDLOCK
where uj.aktion=7 and uj.bz_id not in (20000, 40000, 89795) and us.job_id=uj.job_id and j.job_id=uj.ref_job_id and j.bz_id in(11299, 11399, 11499, 11500)
##
select count(uj.job_id)
from job uj NOHOLDLOCK, job j NOHOLDLOCK, subjob us NOHOLDLOCK
where uj.aktion=7 and uj.bz_id not in (20000, 40000, 89795) and us.job_id=uj.job_id and j.job_id=uj.ref_job_id and j.bz_id in(11299, 11399, 11499, 11500)

into this:
##
## Text text text text
select j.job_id, j.aktion, j.bz_id, j.ts_init, j.gueltig_ab, datediff(dayofyear, uj.ts_init, getdate()) as 'day diff since ts_init', datediff(dayofyear, uj.lastchange, getdate()) as 'day diff since lastchange', uj.job_id, uj.bz_id, uj.ts_init, uj.lastchange, us.* from job uj NOHOLDLOCK, job j NOHOLDLOCK, subjob us NOHOLDLOCK where uj.aktion=7 and uj.bz_id not in (20000, 40000, 89795) and us.job_id=uj.job_id and j.job_id=uj.ref_job_id and j.bz_id in(11299, 11399, 11499, 11500)
##
select count(uj.job_id) from job uj NOHOLDLOCK, job j NOHOLDLOCK, subjob us NOHOLDLOCK where uj.aktion=7 and uj.bz_id not in (20000, 40000, 89795) and us.job_id=uj.job_id and j.job_id=uj.ref_job_id and j.bz_id in(11299, 11399, 11499, 11500)

buzz768

19.06.13, 10:58

sed -n '/^#/!{x;/^#/{p;ba};G;s/\n *//;h;ba};x;/^$/!p;:a;$p'

or

sed -n '1{h;d};x;/^#/{p;d};x;/^#/{x;p;d};H;g;s/\n *//;x;${x;p}'

karl-heinz-lnx

19.06.13, 15:12

nopes

19.06.13, 18:50

Hi,

well, the syntax isn't that wrong: "$var=~s/(.*\n)a/$1/gs" is valid. Here the s at the end makes the . (dot) match a \n as well, so the search can be applied across several lines, which was my original goal. In the script I chose split because it is simply faster than a regex.

But since I'm only half a pro in Perl (which, I think, is a pretty cool language), you're probably right...

Huhn Hur Tu

09.07.13, 12:29

So folks,
now that I had a bit of spare time today and couldn't get along with the Perl stuff, I built the following:

cat ~File.sql | sed 's/--/##/g' | sed 's/^$/||/g' | tr '\n' ' ' | tr '||' '\n'

That does exactly what I want.

Thanks to everyone for the suggestions.

Regards, Stefan

buzz768

09.07.13, 12:38

Hmm..

$ cat File.sql | sed 's/--/##/g' | sed 's/^$/||/g' | tr '\n' ' ' | tr '||' '\n'
## ## Text text text text select j.job_id, j.aktion, j.bz_id, j.ts_init, j.gueltig_ab, datediff(dayofyear, uj.ts_init, getdate()) as 'day diff since ts_init', datediff(dayofyear, uj.lastchange, getdate()) as 'day diff since lastchange', uj.job_id, uj.bz_id, uj.ts_init, uj.lastchange, us.* from job uj NOHOLDLOCK, job j NOHOLDLOCK, subjob us NOHOLDLOCK where uj.aktion=7 and uj.bz_id not in (20000, 40000, 89795) and us.job_id=uj.job_id and j.job_id=uj.ref_job_id and j.bz_id in(11299, 11399, 11499, 11500) ## select count(uj.job_id) from job uj NOHOLDLOCK, job j NOHOLDLOCK, subjob us NOHOLDLOCK where uj.aktion=7 and uj.bz_id not in (20000, 40000, 89795) and us.job_id=uj.job_id and j.job_id=uj.ref_job_id and j.bz_id in(11299, 11399, 11499, 11500) ## ## Text text text select j.job_id, j.aktion, j.bz_id, j.ts_init, j.gueltig_ab, datediff(dayofyear, uj.ts_init, getdate()) as 'day diff since ts_init', datediff(dayofyear, uj.lastchange, getdate()) as 'day diff since lastchange', uj.job_id, uj.bz_id, uj.ts_init, uj.lastchange, us.* from job uj NOHOLDLOCK, job j NOHOLDLOCK, subjob us NOHOLDLOCK where uj.aktion=7 and uj.bz_id not in (20000, 40000, 89795) and us.job_id=uj.job_id and j.job_id=uj.ref_job_id and j.bz_id in(11299, 11399, 11499, 11500) order by datediff(dayofyear, uj.ts_init, getdate()) desc

Huhn Hur Tu

10.07.13, 08:12

This is what the source data looks like:

--## Normale Jobs Anbahnung --
select count(job_id) as anzahlJobs from job (index job_I2 mru) noholdlock
where (aktion = 1 and bz_id in (11000, 11001, 11002, 11003, 11009, 11010, 11011, 11012, 11014, 11498))
and datediff(ss, lastchange, getdate()) > 43200

select * from job (index job_I2 mru) noholdlock
where (aktion = 1 and bz_id in (11000, 11001, 11002, 11003, 11009, 11010, 11011, 11012, 11014, 11498))
and datediff(ss, lastchange, getdate()) > 43200

--## Normale Jobs Ausführung --
select count(job_id) as anzahlJobs from job (index job_I2 mru) noholdlock
where (aktion = 1 and bz_id in (11550, 11660, 11700, 11702, 11707, 11712, 11721, 11722, 11725, 11755))
and datediff(ss, lastchange, getdate()) > 43200

select * from job (index job_I2 mru) noholdlock
where (aktion = 1 and bz_id in (11550, 11660, 11700, 11702, 11707, 11712, 11721, 11722, 11725, 11755))
and datediff(ss, lastchange, getdate()) > 43200

The result is:

#### Normale Jobs Anbahnung ## select count(job_id) as anzahlJobs from job (index job_I2 mru) noholdlock where (aktion = 1 and bz_id in (11000, 11001, 11002, 11003, 11009, 11010, 11011, 11012, 11014, 11498)) and datediff(ss, lastchange, getdate()) > 43200

select * from job (index job_I2 mru) noholdlock where (aktion = 1 and bz_id in (11000, 11001, 11002, 11003, 11009, 11010, 11011, 11012, 11014, 11498)) and datediff(ss, lastchange, getdate()) > 43200

#### Normale Jobs Ausführung ## select count(job_id) as anzahlJobs from job (index job_I2 mru) noholdlock where (aktion = 1 and bz_id in (11550, 11660, 11700, 11702, 11707, 11712, 11721, 11722, 11725, 11755)) and datediff(ss, lastchange, getdate()) > 43200

select * from job (index job_I2 mru) noholdlock where (aktion = 1 and bz_id in (11550, 11660, 11700, 11702, 11707, 11712, 11721, 11722, 11725, 11755)) and datediff(ss, lastchange, getdate()) > 43200

Regards, Stefan

Huhn Hur Tu

10.07.13, 08:15

This makes it a bit prettier still:

sed 's/--/##/g' | sed 's/^$/||/g' | tr '\n' ' ' | tr '||' '\n' | tr '##' '\n'

Regards, Stefan
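
One thing worth knowing about the tr steps in this chain: tr maps single characters, not strings, so '||' and '##' behave like '|' and '#'. A small sketch with made-up input; the result of the two-character form is what GNU tr produces, where a shorter second set is padded with its last character.

```shell
#!/bin/sh
# Each '|' becomes its own newline, so "||" leaves an empty line in between
# and even a single '|' inside a statement would break the line.
printf 'a||b|c\n' | tr '||' '\n'

# The one-character form gives the same result.
printf 'a||b|c\n' | tr '|' '\n'
```

This is why the chain relies on the marker characters not occurring anywhere else in the statements.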

buzz768

10.07.13, 09:30

Okay.
I have reworked the sed expression once more:

# Join lines that do not begin with a '#' and are not empty
sed -n '1{h;be};x;/^[^#]\+/!{p;be};x;/^[^#]\+/!{x;p;be};H;g;s/\n/ /;x;:e;${x;p}'
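
For comparison, a sketch of the same rule in awk, which may be easier to follow than the hold-space handling. This is a swapped-in alternative and not a translation of the sed expression above; the function name join_statements and the sample input are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: keep '#' lines and empty lines as they are and join
# every run of other lines into one line, separated by single spaces.
join_statements() {
  awk '
    /^#/ || /^$/ {
      if (have) { print buf; have = 0 }   # flush the pending statement
      print                                # comment or empty line, unchanged
      next
    }
    { buf = (have ? buf " " $0 : $0); have = 1 }
    END { if (have) print buf }
  '
}

printf '%s\n' '## Jobs' 'select count(job_id) from job' 'where aktion = 1' '' \
  'select * from job' 'where aktion = 1' | join_statements
```

Because empty lines are passed through, statements that are separated only by a blank line stay separate.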

Huhn Hur Tu

10.07.13, 10:45

Wow,

thanks, that gives me some reading for the holidays.

Thanks

Stefan