Given the following XML:
<Ergebnisse>
<Spiel>
<Datum>2013-10-02</Datum>
</Spiel>
<Spiel>
<Datum>2013-10-03</Datum>
</Spiel>
<Spiel>
<Datum>2013-10-03</Datum>
</Spiel>
<Spiel>
<Datum>2013-10-03</Datum>
</Spiel>
<Spiel>
<Datum>2013-10-06</Datum>
</Spiel>
<Spiel>
<Datum>2013-10-06</Datum>
</Spiel>
<Spiel>
<Datum>2013-10-06</Datum>
</Spiel>
<Spiel>
<Datum>2013-10-06</Datum>
</Spiel>
<Spiel>
<Datum>2014-05-01</Datum>
</Spiel>
<Spiel>
<Datum>2014-05-01</Datum>
</Spiel>
<Spiel>
<Datum>2014-04-27</Datum>
</Spiel>
</Ergebnisse>
Now I need to find the highest date value in "Datum". I'm using Python and lxml, so I can only work with XPath 1.0.
I tried:
//Spiel[not (Datum < preceding::Spiel/Datum) and not (Datum < following::Spiel/Datum)]/Datum
but it returns all values. What can I do?
Thanks!
In XSLT 1.0 you can sort the values and take the last.
<xsl:for-each select="Spiel/Datum">
<xsl:sort select="."/>
<xsl:if test="position()=last()"><xsl:value-of select="."/></xsl:if>
</xsl:for-each>
I wouldn't attempt it in XPath alone - if XPath 1.0 is all you have, return all the values and do the extraction in the Python host language.
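As a minimal sketch of the "do the extraction in Python" route: the example below uses the standard library's xml.etree.ElementTree so that it runs without extra packages; with lxml the values could be collected with root.xpath("//Spiel/Datum/text()") instead. It relies on the assumption that every Datum is in ISO YYYY-MM-DD form, so plain string comparison orders the dates chronologically. The helper name latest_datum and the shortened XML are illustrative only.

```python
import xml.etree.ElementTree as ET

# Shortened sample in the same shape as the XML from the question.
XML = """<Ergebnisse>
  <Spiel><Datum>2013-10-02</Datum></Spiel>
  <Spiel><Datum>2014-05-01</Datum></Spiel>
  <Spiel><Datum>2014-04-27</Datum></Spiel>
</Ergebnisse>"""


def latest_datum(root):
    """Return the greatest Datum text, or None if there is no Datum element."""
    # ISO dates (YYYY-MM-DD) sort chronologically when compared as strings.
    dates = [d.text.strip() for d in root.iter("Datum") if d.text]
    return max(dates, default=None)


root = ET.fromstring(XML)
print(latest_datum(root))  # prints 2014-05-01
```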
I got stuck over this problem for a while and although I found a diverse set of similar questions none exactly fitted my problem or solved the issue. So here is the deal:
I have an input.fasta , formatted like this:
>sp|O42363|APOA1_DANRE Apolipoprotein A-I OS=Danio rerio OX=7955 GN=apoa1 PE=2 SV=1
MKFVALALTLLLALGSQANLFQADAPTQLEHYKAAALVYLNQVKDQAEKALDNLDGTDYEQYKLQLSESLTKLQEYAQTTSQALTPYAETISTQLMENTKQLRERVMTDVEDLRSKLEPHRAELYTALQKHIDEYREKLEPVFQEYSALNRQNAEQLRAKLEPLMDDIRKAFESNIEETKSKVVPMVEAVRTKLTERLEDLRTMAAPYAEEYKEQLVKAVEEAREKIAPHTQDLQTRMEPYMENVRTTFAQMYETIAKAIQA
>sp|Q90260|ASL1B_DANRE Achaete-scute homolog 1b OS=Danio rerio OX=7955 GN=ascl1b PE=2 SV=1
MEATVVATTQLTQDSFYQPFSESLEKQDRECKVLKRQRSSSPELLRCKRRLTFNGLGYTIPQQQPMAVARRNERERNRVKQVNMGFQTLRQHVPNGAANKKMSKVETLRSAVEYIRALQQLLDEHDAVSAVLQCGVPSPSVSNAYSAGPESPHSAYSSDEGSYEHLSSEEQELLDFTTWFDRYESGASMATKDWC
>sp|Q6TH01|C10_DANRE Protein C10 OS=Danio rerio OX=7955 GN=si:dkey-29f10.1 PE=2 SV=1
MASAPAQQPTLTVEQARVVLSEVIQAFSVPENAARMEEARESACNDMGKMLQLVLPVATQIQQEVIKAYGFNNEGEGVLKFARLVKMYETQDPEIAAMSVKLKSLLLPPLSTPPIGSGIPTS
>sp|Q6PFL6|CCD43_DANRE Coiled-coil domain-containing protein 43 OS=Danio rerio OX=7955 GN=ccdc43 PE=2 SV=1
MAAPEQIAGEFENWLNERLDSLEVDREVYGAYILGVLQEEESDEEQKDALQGILSAFLEEETLEEVCQEILKQWTECCSRSGAKSNQADAEVQAIASLIEKQAQIVVKQKEVSEDAKKRKEAVLAQYANVTDDEDEAEEEEQVPVGIPSDKSLFKNTNVEDVLNRRKLQRDQAKEDAQKKKEQDKMQREKDKLSKQERKDKEKKRTQKGERKR
>sp|P0C7U5|C5AR1_DANRE C5a anaphylatoxin chemotactic receptor 1 OS=Danio rerio OX=7955 GN=c5ar1 PE=3 SV=1
MDDNNSDWTSYDFGNDTIPSPNEISLSHIGTRHWITLVCYGIVFLLGVPGNALVVWVTGFRMPNSVNAQWFLNLAIADLLCCLSLPILMVPLAQDQHWPFGALACKLFSGIFYMMMYCSVLLLVVISLDRFLLVTKPVWCQNNRQPRQARILCFIIWILGLLGSSPYFAHMEIQHHSETKTVCTGSYSSLGHAWAITIIRSFLFFLLPFLIICISHWKVYHMTSSGRRQRDKSSRTLRVILALVLGFFLCWTPLH
and an ids.txt list, formatted as this:
Q90260
Q6PFL6
I would like to extract every FASTA sequence, together with its header, whose header contains one of the IDs listed in ids.txt.
I have tried grep -w -A 2 -Ff id_list.txt input.fasta --no-group-separator > out.fasta but that did not work.
Ideally, I would like a regex that checks whether the string between the two | characters of each line starting with >sp matches any ID in my ids.txt and, if so, stores that header and its sequence in out.fasta.
So that out.fasta would look like that:
>sp|Q90260|ASL1B_DANRE Achaete-scute homolog 1b OS=Danio rerio OX=7955 GN=ascl1b PE=2 SV=1
MEATVVATTQLTQDSFYQPFSESLEKQDRECKVLKRQRSSSPELLRCKRRLTFNGLGYTIPQQQPMAVARRNERERNRVKQVNMGFQTLRQHVPNGAANKKMSKVETLRSAVEYIRALQQLLDEHDAVSAVLQCGVPSPSVSNAYSAGPESPHSAYSSDEGSYEHLSSEEQELLDFTTWFDRYESGASMATKDWC
>sp|Q6PFL6|CCD43_DANRE Coiled-coil domain-containing protein 43 OS=Danio rerio OX=7955 GN=ccdc43 PE=2 SV=1
MAAPEQIAGEFENWLNERLDSLEVDREVYGAYILGVLQEEESDEEQKDALQGILSAFLEEETLEEVCQEILKQWTECCSRSGAKSNQADAEVQAIASLIEKQAQIVVKQKEVSEDAKKRKEAVLAQYANVTDDEDEAEEEEQVPVGIPSDKSLFKNTNVEDVLNRRKLQRDQAKEDAQKKKEQDKMQREKDKLSKQERKDKEKKRTQKGERKR
I am pretty sure this can be expressed via awk or grep but I am new to bash so I am having a hard time right now.
Thanks a lot in advance! :)
Using join and sort:
join -t \| -2 2 -o 2.1,2.2,2.3 <(sort ids.txt) <(sort -t \| -k2 input.fasta)
This assumes there is no extra | character in input.fasta and that the order of the output lines isn't significant. Note that it prints only the matching header lines, not the sequences.
With awk:
awk -F'[|]' 'NR==FNR{ids[$0];next}$2 in ids' ids.txt input.fasta
Explanation:
I'm passing both files ids.txt and input.fasta as input files to awk. The order is important. -F'[|]' sets the input field delimiter to a |.
The script:
# NR is the overall record (line) number
# FNR is the record (line) number in the current input file
NR==FNR { # True as long as we are reading the first input file
ids[$0] # Create a key in ids for every id from ids.txt
next # Don't process further actions
}
# Because of the 'next' statement above, we'll reach this point only
# when reading the second input file (input.fasta)
$2 in ids # Print the current line if the second field
# was found in the ids lookup
Output:
>sp|Q90260|ASL1B_DANRE Achaete-scute homolog 1b OS=Danio rerio OX=7955 GN=ascl1b PE=2 SV=1
>sp|Q6PFL6|CCD43_DANRE Coiled-coil domain-containing protein 43 OS=Danio rerio OX=7955 GN=ccdc43 PE=2 SV=1
Update: It turned out that you also want to print the line below the match. This can be achieved like this:
BEGIN {
FS="[|]"
}
# NR is the overall record (line) number
# FNR is the record (line) number in the current input file
NR==FNR { # True as long as we are reading the first input file
ids[$0] # Create a key in ids for every id from ids.txt
next # Don't process further actions
}
# Because of the 'next' statement above, we'll reach this point only
# when reading the second input file (input.fasta)
$2 in ids {
# set or reset a variable p to 2 if the second field
# was found in the ids lookup
p = 2
}
# Decrement the variable p on every iteration and check if it
# is greater than 0 after that. If that's true, awk will print
# the current line
p-- > 0
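The awk version above prints the header plus exactly one following line, so it assumes every sequence sits on a single line. As a hedged sketch of the same idea in Python that also copes with sequences wrapped over several lines: the function name filter_fasta and the shortened sample sequences are made up for illustration, and headers are assumed to have the accession as the second |-separated field.

```python
def filter_fasta(fasta_lines, wanted_ids):
    """Yield the header and sequence lines of records whose accession is wanted."""
    keep = False
    for line in fasta_lines:
        if line.startswith(">"):
            # Header looks like ">sp|Q90260|ASL1B_DANRE ..."; field 2 is the ID.
            fields = line.split("|")
            keep = len(fields) > 1 and fields[1] in wanted_ids
        if keep:
            yield line


# Shortened, partly multi-line sample records (not the full sequences).
fasta = [
    ">sp|O42363|APOA1_DANRE Apolipoprotein A-I",
    "MKFVALALTLLLALGSQA",
    ">sp|Q90260|ASL1B_DANRE Achaete-scute homolog 1b",
    "MEATVVATTQLTQDSFYQ",
    "PFSESLEKQDRECKVLKR",
    ">sp|Q6PFL6|CCD43_DANRE Coiled-coil domain-containing protein 43",
    "MAAPEQIAGEFENWLNER",
]
ids = {"Q90260", "Q6PFL6"}

result = list(filter_fasta(fasta, ids))
print("\n".join(result))
```

With real files, the lines would come from open("input.fasta") and the set from ids.txt, with trailing newlines stripped from both.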
Consider the following 'sample.xml'
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<root>
<level>
<name>testA</name>
<level>
<name>testB</name>
</level>
<level>
<name>testC</name>
<level>
<name>testD</name>
<level>
<name>testE</name>
</level>
</level>
</level>
</level>
</root>
Using xmlstarlet I can do:
xml sel -t -m //level -v name -o " " -v "count(ancestor::*)-1" -o "." -v "count(preceding-sibling::*)" -n sample.xml
This produces:
testA 0.0
testB 1.1
testC 1.2
testD 2.1
testE 3.1
What should I do to get:
testA 0.0
testB 1.1
testC 1.2
testD 1.2.1
testE 1.2.1.1
In this example I only have 4 levels, but there can be more than 4.
I am thinking of some kind of recursion. Are there any links available that explain how to do that?
You should be able to do this using XSLT with the "tr" command in xmlstarlet...
However your desired output is a little confusing. If "testA" is the first level and you start at zero, why don't all the other entries start at zero? Or maybe "root" is supposed to be zero?
Anyway, here's an example that starts at 1 instead of zero that should get you started...
XML Input (input.xml)
<root>
<level>
<name>testA</name>
<level>
<name>testB</name>
</level>
<level>
<name>testC</name>
<level>
<name>testD</name>
<level>
<name>testE</name>
</level>
</level>
</level>
</level>
</root>
XSLT 1.0 (test.xsl)
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:strip-space elements="*"/>
<xsl:template match="level">
<xsl:value-of select="concat(name, ' ')"/>
<xsl:for-each select="ancestor-or-self::level">
<xsl:if test="not(position()=1)">.</xsl:if>
<xsl:number/>
</xsl:for-each>
<xsl:text>
</xsl:text>
<xsl:apply-templates select="level"/>
</xsl:template>
</xsl:stylesheet>
Command Line
xmlstarlet tr test.xsl input.xml
Output
testA 1
testB 1.1
testC 1.2
testD 1.2.1
testE 1.2.1.1
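For comparison, the same 1-based numbering can be sketched with a short recursive walk in Python. This is only an illustration, not part of the xmlstarlet solution; the generator name numbered_levels is hypothetical, and it assumes each level element has a name child as in the sample.

```python
import xml.etree.ElementTree as ET

XML = """<root><level><name>testA</name>
  <level><name>testB</name></level>
  <level><name>testC</name>
    <level><name>testD</name>
      <level><name>testE</name></level>
    </level>
  </level>
</level></root>"""


def numbered_levels(parent, prefix=()):
    """Yield (name, dotted number) for every nested <level>, in document order."""
    # The position among sibling <level> elements gives the last number part.
    for index, level in enumerate(parent.findall("level"), start=1):
        number = prefix + (index,)
        yield level.findtext("name"), ".".join(map(str, number))
        # Recurse with the path so far as the prefix for the children.
        yield from numbered_levels(level, number)


for name, number in numbered_levels(ET.fromstring(XML)):
    print(name, number)
```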
This problem can be solved without recursion, by iterating over
elements on the ancestor-or-self axis.
The following xmlstarlet command processes all level elements using
the inner -m (xsl:for-each) to handle each path from root to target
(as suggested in comments the shell variable base defaults to 0 but
can be set to 1).
xmlstarlet select -T -t \
-m '//level' \
-v 'concat(name," ")' \
-m 'ancestor-or-self::level' \
--if 'position() = 1' \
-v "'${base:-0}'" \
--else \
-o '.' \
-v 'count(preceding-sibling::level) + 1' \
-b \
-b \
-n \
file.xml
Output:
testA 0
testB 0.1
testC 0.2
testD 0.2.1
testE 0.2.1.1
For a more compact inner -m that produces the same output, use instead:
-m 'ancestor-or-self::level' \
--if 'position() != 1' -o '.' -b \
-v 'count(preceding-sibling::level) + number(position() != 1)' \
-b \
where the count is incremented by 1 for all except the root level
where position() is 1.
As a variation on the same theme: select elements with the shell
variable target and print their paths as XPath expressions using the
XSLT current() function to reference the
element being processed by the inner -m:
target='//level[name="testB" or name="testE"]'
xmlstarlet select -T -t \
-m "${target}" \
-m 'ancestor-or-self::*' \
--var pos='1 + count(preceding-sibling::*[name() = name(current())])' \
-v 'concat("/",name(),"[",$pos,"]")' \
-b \
-n \
file.xml
Output:
/root[1]/level[1]/level[1]
/root[1]/level[1]/level[2]/level[1]/level[1]
I have a very large KML file (over 20000 placemarkers). They are named by numbers which go up in increments of 5 starting at about 7000 up to 27000.
<Placemark>
<name>7750</name>
<description><![CDATA[converted by:</br>GridReferenceFinder.com</br>]]></description>
<Point>
<coordinates>-0.99153654,52.225002,0</coordinates>
</Point>
</Placemark>
I would like to remove any placemarker whose name doesn't end in 00 or 50. Having a placemarker every 5 metres is slowing down some of the lower-end devices on site.
Is there some script, command or whatever that will check the name and if it doesn't end in 00 or 50 delete from <Placemark> to </Placemark> for that entry?
You would literally be saving me 10 hours work deleting them individually.
A Perl one-liner solution!
I would like to remove any placemarker that doesnt end in 00 or 50
First of all, a pattern for this: match any number except those that end with 00 or 50
^(?:[7-9]\d|[1-2]\d\d)(?!00|50)\d\d$
A fast test can be:
perl -le 'print for grep{ /^(?:[7-9]\d|[1-2]\d\d)(?=00|50)\d\d$/ } 7000..27000'
then read the entire file once:
$/=undef;
then read all matches with a while loop:
while/<Placemark>\s*?<name>(?:[7-9]\d|[1-2]\d\d)(?!00|50)\d\d.*?<\/Placemark>/sg
The s flag lets . match a newline, so the input is treated as a single line, and g makes the search global.
Then print the match ($&):
perl -lne '$/=undef;print $& while/<Placemark>\s*?<name>(?:[7-9]\d|[1-2]\d\d)(?!00|50)\d\d.*?<\/Placemark>/sg' file
pattern for match:
<Placemark>\s*?<name>(?:[7-9]\d|[1-2]\d\d)(?!00|50)\d\d.*?<\/Placemark>
NOTE:
Notice the (?!00|50) part: it is a negative lookahead that excludes those endings. Changing it to a positive lookahead inverts the test, that is:
^(?:[7-9]\d|[1-2]\d\d)(?=00|50)\d\d$
only matches things that end with 00 or 50.
So you can use this to switch between what you want and what you do not want.
Print all placemarks whose name does not end with 00 or 50:
perl -lne '$/=undef;print $& while/<Placemark>\s*?<name>(?:[7-9]\d|[1-2]\d\d)(?!00|50)\d\d.*?<\/Placemark>/sg' file
Print all placemarks whose name ends with 00 or 50:
perl -lne '$/=undef;print $& while/<Placemark>\s*?<name>(?:[7-9]\d|[1-2]\d\d)(?=00|50)\d\d.*?<\/Placemark>/sg' file
How to Substitute
If you like, you can use the substitution operator: s/regex-match/substitute-string/
perl -pe '$/=undef;s/<Placemark>\s*?<name>(?:[7-9]\d|[1-2]\d\d)(?!00|50)\d\d.*?<\/Placemark>/==>DELETE<==/sg' file
test:
input:
before...
<Placemark>
<name>7700</name>
<description><![CDATA[converted by:</br>GridReferenceFinder.com</br>]]></description>
<Point>
<coordinates>-0.99153654,52.225002,0</coordinates>
</Point>
</Placemark>
after...
---------
before...
<Placemark>
<name>7701</name>
<description><![CDATA[converted by:</br>GridReferenceFinder.com</br>]]></description>
<Point>
<coordinates>-0.99153654,52.225002,0</coordinates>
</Point>
</Placemark>
after...
--------
before...
<Placemark>
<name>27650</name>
<description><![CDATA[converted by:</br>GridReferenceFinder.com</br>]]></description>
<Point>
<coordinates>-0.99153654,52.225002,0</coordinates>
</Point>
</Placemark>
after...
--------
before...
<Placemark>
<name>27651</name>
<description><![CDATA[converted by:</br>GridReferenceFinder.com</br>]]></description>
<Point>
<coordinates>-0.99153654,52.225002,0</coordinates>
</Point>
</Placemark>
after...
end.
the output:
before...
<Placemark>
<name>7700</name>
<description><![CDATA[converted by:</br>GridReferenceFinder.com</br>]]></description>
<Point>
<coordinates>-0.99153654,52.225002,0</coordinates>
</Point>
</Placemark>
after...
---------
before...
==>DELETE<==
after...
--------
before...
<Placemark>
<name>27650</name>
<description><![CDATA[converted by:</br>GridReferenceFinder.com</br>]]></description>
<Point>
<coordinates>-0.99153654,52.225002,0</coordinates>
</Point>
</Placemark>
after...
--------
before...
==>DELETE<==
after...
end.
NOTE.2:
You can use -i to edit the file in place:
perl -i.bak -pe ' ... the rest of the script ...' file
It is better to use Perl 5.22 or later.
Something like this in awk:
$ awk '
/<Placemark>/ { d=""; b="" } # d is delete, b is buffer, reset both
{ b=b $0 (/<\/Placemark>/?"":ORS) } # gather data to b
!/[50]0</ && /<\/name>/ { d=1 } # if not 50 or 00 set del flag
/<\/Placemark>/ && d!=1 { print b } # print b if not marked delete
' file
It only works with well formed input, especially:
...
</Placemark>
<Placemark>
...
<name>1234</name>
...
not:
...
</Placemark><Placemark>
... <!-- or: -->
<name>1234
</name>
...
It's an ugly hack. Someone will probably set you up with something nicer but try it if it suits your needs.
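If a real XML parser is an option, the same thinning can be sketched in Python with the standard library's xml.etree.ElementTree, which does not depend on how the lines are laid out. This is only a sketch under assumptions: the function names local and thin_placemarks are hypothetical, the sample is a cut-down stand-in for the real KML, and namespaces (real KML files usually declare one) are handled by comparing local tag names only.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def thin_placemarks(root, keep_suffixes=("00", "50")):
    """Remove every Placemark whose name does not end in one of keep_suffixes."""
    removed = 0
    # Snapshot the tree first so removing children does not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if local(child.tag) != "Placemark":
                continue
            name = next((e.text or "" for e in child if local(e.tag) == "name"), "")
            if not name.strip().endswith(keep_suffixes):
                parent.remove(child)
                removed += 1
    return removed


# Cut-down stand-in for the KML file from the question.
KML = """<Document>
<Placemark><name>7700</name></Placemark>
<Placemark><name>7705</name></Placemark>
<Placemark><name>27650</name></Placemark>
<Placemark><name>27655</name></Placemark>
</Document>"""

root = ET.fromstring(KML)
removed = thin_placemarks(root)
remaining = [p.findtext("name") for p in root.findall("Placemark")]
print(removed, remaining)  # prints 2 ['7700', '27650']
```

For a file on disk, ET.parse(path) followed by tree.write(new_path) would wrap the same function; writing to a new file keeps the original as a backup.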
awk '$0 == "<Placemark>" {cnt=cnt+1} {arry[cnt]=arry[cnt]$0"\n";if ($0 ~ /<name>/) {match($0,/[[:digit:]]+/);num=substr($0,RSTART,RLENGTH);numbs[num]=cnt}} END { for ( i in numbs ) {if ( substr(i,length(i)-1,length(i)) == "00" || substr(i,length(i)-1,length(i)) == "50") { print arry[numbs[i]] } } }' filename
An alternative awk solution is shown above. We first increment a counter whenever $0 equals <Placemark>. Then we append each line to the array arry, keyed by that counter, so that each entry holds one Placemark element of the file. As we do this we also check for the name element. When we find it, we extract the number within it (match function) and use it as the key of another array, numbs, which maps each number to the counter of its Placemark element. Finally we loop through the elements of numbs, checking whether each number ends in 50 or 00. If it does, the corresponding arry entry is printed. Note that awk does not specify the order of for (i in numbs), so the Placemarks may not come out in their original order.
I want to split a file with the following algorithm.
This CSV has 3600 lines, previously sorted alphabetically by name ( sort -k2 -n file.csv )
Currently I can run this command to split the file in equal number of lines:
split -l ${MAX_NUMBER_OF_LINES} filename.csv ${new_file_pattern}.
But the original requirement is:
Split into chunks of ${MAX_NUMBER_OF_LINES} UNLESS no more records with the first letter of the column 2 exists.
For example:
If I have ${MAX_NUMBER_OF_LINES} = 300, I can split the file into a chunk of 300 lines only if there are no more occurrences of the first letter found on the last line of that chunk.
If line 301 has a record such as "Arboreal Peaches", the script has to add it to the current chunk even though ${MAX_NUMBER_OF_LINES} was already reached.
This is a somewhat confusing explanation. I hope one of you can help me (I have already spent 2 days on this algorithm).
UPDATE
${MAX_NUMBER_OF_LINES} = 3
Example CSV (with fewer lines, for example purposes).
Split command reaches ${MAX_NUMBER_OF_LINES}, but the line 4 already has a record with the letter A
'Aberdeen Research", 'Los Angeles', 'California'
'Aplueyo Labs", 'Los Angeles', 'US'
'Acar Media Group", 'Los Angeles', 'US'
'Aberdeen Research", 'San Jose', 'US'
'Beethoven Inc", 'San Jose', 'US'
EXPECTED RESULT
Split files
1
'Aberdeen Research", 'Los Angeles', 'California'
'Aplueyo Labs", 'Los Angeles', 'US'
'Acar Media Group", 'Los Angeles', 'US'
'Aberdeen Research", 'San Jose', 'US'
2
'Beethoven Inc", 'San Jose', 'US'
Something like this? In awk:
$ cat split.awk
BEGIN {
if(max=="") { # exit if no max was given
print "Invalid number of lines"
exit
}
}
(a=substr($0,2,1)) && ++c>max && prev!=a { # first letter to a; if the chunk is
c=1 # full and the first letter changes,
fc++ # restart the count and bump the file counter
}
{
print $0 > ((mask==""?"x":mask) (fc==""?0:fc)) # write to file, default mask x
prev=a # remember previous first letter
}
Run it:
$ awk -v max=3 -v mask="file" -f split.awk file.csv
$ cat file0
'Aberdeen Research", 'Los Angeles', 'California'
'Aplueyo Labs", 'Los Angeles', 'US'
'Acar Media Group", 'Los Angeles', 'US'
'Aberdeen Research", 'San Jose', 'US'
$ cat file1
'Beethoven Inc", 'San Jose', 'US'
mask is the filename prefix or $new_file_pattern and max is $MAX_NUMBER_OF_LINES, ie. in the command line set -v max=$MAX_NUMBER_OF_LINES -v mask=$new_file_pattern.
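The splitting rule itself (close a chunk only when it already holds max lines and the first letter changes) can be sketched in Python as follows. The function name split_by_initial is hypothetical, and, like the awk script, the sketch assumes the grouping letter is the second character of each line, right after the opening quote. It returns the chunks as lists rather than writing files.

```python
def split_by_initial(lines, max_lines, key=lambda line: line[1:2]):
    """Group lines into chunks that are closed only at a change of initial."""
    chunks, current = [], []
    for line in lines:
        # Start a new chunk only when the current one is full AND the
        # initial differs from that of the previous line.
        if len(current) >= max_lines and key(line) != key(current[-1]):
            chunks.append(current)
            current = []
        current.append(line)
    if current:
        chunks.append(current)
    return chunks


# The example rows from the question, quotes kept exactly as given.
ROWS = """\
'Aberdeen Research", 'Los Angeles', 'California'
'Aplueyo Labs", 'Los Angeles', 'US'
'Acar Media Group", 'Los Angeles', 'US'
'Aberdeen Research", 'San Jose', 'US'
'Beethoven Inc", 'San Jose', 'US'
""".splitlines()

chunks = split_by_initial(ROWS, 3)
for number, chunk in enumerate(chunks):
    print(number, len(chunk))
```

Each chunk could then be written to its own file, for example by joining its lines and using the chunk number as a filename suffix.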
My task is to transform xml data to csv (comma separator data).
I have a problem with sorting in the output data.
Please look at my examples below.
Please provide any suggestions how to resolve this issue.
Thanks in advance!
INPUT XML DATA
<?xml version="1.0" encoding="UTF-8"?>
<Root>
<ItemInfo>
<ItemNmb>Item1</ItemNmb>
<ItemText>Item 111</ItemText>
<ItemDetails>
<ItemDetailInfo>
<id>111</id>
<Text>Text 111</Text>
</ItemDetailInfo>
<ItemDetailInfo>
<id>555</id>
<Text>Text 555</Text>
</ItemDetailInfo>
</ItemDetails>
</ItemInfo>
<ItemInfo>
<ItemNmb>Item2</ItemNmb>
<ItemText>Item 222</ItemText>
<ItemDetails>
<ItemDetailInfo>
<id>555</id>
<Text>Text 555</Text>
</ItemDetailInfo>
<ItemDetailInfo>
<id>333</id>
<Text>Text 333</Text>
</ItemDetailInfo>
<ItemDetailInfo>
<id>222</id>
<Text>Text 222</Text>
</ItemDetailInfo>
</ItemDetails>
</ItemInfo>
<ItemInfo>
<ItemNmb>Item3</ItemNmb>
<ItemText>Item 333</ItemText>
<ItemDetails>
<ItemDetailInfo>
<id>999</id>
<Text>Text 999</Text>
</ItemDetailInfo>
</ItemDetails>
</ItemInfo>
</Root>
XSLT
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<xsl:output method="text" encoding="UTF-8" indent="yes"/>
<xsl:param name="delim" select="';'"/>
<xsl:param name="break" select="'
'"/>
<xsl:template match="/">
<xsl:for-each select="/Root/ItemInfo">
<xsl:call-template name="itemtemp">
<xsl:with-param name="item" select="ItemNmb"/>
<xsl:with-param name="text" select="ItemText"/>
</xsl:call-template>
</xsl:for-each>
</xsl:template>
<xsl:template name="itemtemp">
<xsl:param name="item"/>
<xsl:param name="text"/>
<xsl:for-each select="ItemDetails/ItemDetailInfo">
<xsl:sort select="id" data-type="text" order="ascending"/>
<xsl:call-template name="copmitemtemp">
<xsl:with-param name="item" select="$item"/>
<xsl:with-param name="text" select="$text"/>
<xsl:with-param name="idsub" select="id"/>
<xsl:with-param name="textsub" select="Text"/>
</xsl:call-template>
</xsl:for-each>
</xsl:template>
<xsl:template name="copmitemtemp">
<xsl:param name="item"/>
<xsl:param name="text"/>
<xsl:param name="idsub"/>
<xsl:param name="textsub"/>
<xsl:value-of select="$idsub" disable-output-escaping="yes"/><xsl:value-of select="$delim"/>
<xsl:value-of select="$textsub" disable-output-escaping="yes"/><xsl:value-of select="$delim"/>
<xsl:value-of select="$item" disable-output-escaping="yes"/><xsl:value-of select="$delim"/>
<xsl:value-of select="$text" disable-output-escaping="yes"/><xsl:value-of select="$break"/>
</xsl:template>
</xsl:stylesheet>
OUTPUT DATA
111;Text 111;Item1;Item 111
555;Text 555;Item1;Item 111
222;Text 222;Item2;Item 222
333;Text 333;Item2;Item 222
555;Text 555;Item2;Item 222
999;Text 999;Item3;Item 333
EXPECTED RESULT (sorted by id)
111;Text 111;Item1;Item 111
222;Text 222;Item2;Item 222
333;Text 333;Item2;Item 222
555;Text 555;Item1;Item 111
555;Text 555;Item2;Item 222
999;Text 999;Item3;Item 333
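It looks like the sorting should be done by ItemInfo/ItemDetails/ItemDetailInfo/id across the whole document, so you need to iterate over all ItemDetailInfo elements in a single for-each rather than sorting within each ItemInfo.
Try this slightly changed version of your XSLT.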
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<xsl:output method="text" encoding="UTF-8" indent="yes"/>
<xsl:param name="delim" select="';'"/>
<xsl:param name="break" select="'
'"/>
<xsl:template match="/">
<xsl:for-each select="/Root/ItemInfo/ItemDetails/ItemDetailInfo">
<xsl:sort select="id" data-type="text" order="ascending"/>
<xsl:call-template name="copmitemtemp">
<xsl:with-param name="item" select="../../ItemNmb"/>
<xsl:with-param name="text" select="../../ItemText"/>
<xsl:with-param name="idsub" select="id"/>
<xsl:with-param name="textsub" select="Text"/>
</xsl:call-template>
</xsl:for-each>
</xsl:template>
<xsl:template name="copmitemtemp">
<xsl:param name="item"/>
<xsl:param name="text"/>
<xsl:param name="idsub"/>
<xsl:param name="textsub"/>
<xsl:value-of select="$idsub" disable-output-escaping="yes"/>
<xsl:value-of select="$delim"/>
<xsl:value-of select="$textsub" disable-output-escaping="yes"/>
<xsl:value-of select="$delim"/>
<xsl:value-of select="$item" disable-output-escaping="yes"/>
<xsl:value-of select="$delim"/>
<xsl:value-of select="$text" disable-output-escaping="yes"/>
<xsl:value-of select="$break"/>
</xsl:template>
</xsl:stylesheet>
Which will generate the following output
111;Text 111;Item1;Item 111
222;Text 222;Item2;Item 222
333;Text 333;Item2;Item 222
555;Text 555;Item1;Item 111
555;Text 555;Item2;Item 222
999;Text 999;Item3;Item 333
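The underlying idea, flatten all detail rows first and sort once over the whole list, can be sketched in Python for comparison. This is only an illustration with a shortened input; the function name rows_sorted_by_id is hypothetical, and ids are compared as strings here, which is roughly what data-type="text" asks for.

```python
import xml.etree.ElementTree as ET

# Shortened version of the input document.
XML = """<Root>
  <ItemInfo><ItemNmb>Item1</ItemNmb><ItemText>Item 111</ItemText>
    <ItemDetails>
      <ItemDetailInfo><id>111</id><Text>Text 111</Text></ItemDetailInfo>
      <ItemDetailInfo><id>555</id><Text>Text 555</Text></ItemDetailInfo>
    </ItemDetails></ItemInfo>
  <ItemInfo><ItemNmb>Item2</ItemNmb><ItemText>Item 222</ItemText>
    <ItemDetails>
      <ItemDetailInfo><id>555</id><Text>Text 555</Text></ItemDetailInfo>
      <ItemDetailInfo><id>222</id><Text>Text 222</Text></ItemDetailInfo>
    </ItemDetails></ItemInfo>
</Root>"""


def rows_sorted_by_id(root, delim=";"):
    """Flatten every ItemDetailInfo into one row, then sort all rows by id."""
    rows = []
    for item in root.findall("ItemInfo"):
        nmb, text = item.findtext("ItemNmb"), item.findtext("ItemText")
        for detail in item.findall("ItemDetails/ItemDetailInfo"):
            rows.append((detail.findtext("id"), detail.findtext("Text"), nmb, text))
    # list.sort is stable, so rows with equal ids keep document order.
    rows.sort(key=lambda row: row[0])
    return [delim.join(row) for row in rows]


lines = rows_sorted_by_id(ET.fromstring(XML))
print("\n".join(lines))
```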