Extract data between two words for the first occurrence in an XML file in Unix - bash

How can I extract the data between "so" and "again" (only the first occurrence of "test")?
cat > sedtesting.txt
this is for testing
so test
AAgainn and again
this is for testing
so test
AAgainn and again
Expected output:
so test
AAgainn and again
But what I am getting is:
so test
AAgainn and again
so test
AAgainn and again
In the sample below, we need to extract the data between "Exp_CDL_CONTRACT_D" and "Tracing Level" (first occurrence only):
<TRANSFORMATION DESCRIPTION ="" NAME ="Exp_CDL_CONTRACT_D" OBJECTVERSION ="1" REUSABLE ="NO" TYPE ="Expression" VERSIONNUMBER ="15">
<TRANSFORMFIELD DATATYPE ="string" DEFAULTVALUE ="&apos;UNKNOWN&apos;" DESCRIPTION ="" EXPRESSION ="CONTRACT_NUM" EXPRESSIONTYPE ="GENERAL" NAME ="CONTRACT_NUM" PICTURETEXT ="" PORTTYPE ="INPUT/OUTPUT" PRECISION ="120" SCALE ="0"/>
<TRANSFORMFIELD DATATYPE ="string" DEFAULTVALUE ="-999" DESCRIPTION ="" EXPRESSION ="MASTER_AGREEMENT_NUM" EXPRESSIONTYPE ="GENERAL" NAME ="MASTER_AGREEMENT_NUM" PICTURETEXT ="" PORTTYPE ="INPUT/OUTPUT" PRECISION ="255" SCALE ="0"/>
<TRANSFORMFIELD DATATYPE ="string" DEFAULTVALUE ="" DESCRIPTION ="" EXPRESSION ="DEAL_NUM" EXPRESSIONTYPE ="GENERAL" NAME ="DEAL_NUM" PICTURETEXT ="" PORTTYPE ="INPUT/OUTPUT" PRECISION ="50" SCALE ="0"/>
<TRANSFORMFIELD DATATYPE ="date/time" DEFAULTVALUE ="" DESCRIPTION ="" EXPRESSION ="FUNDING_DT" EXPRESSIONTYPE ="GENERAL" NAME ="FUNDING_DT" PICTURETEXT ="" PORTTYPE ="INPUT/OUTPUT" PRECISION ="29" SCALE ="9"/>
<TRANSFORMFIELD DATATYPE ="date/time" DEFAULTVALUE ="TO_DATE(&apos;1/1/1900 00:00:00 &apos;,&apos;MM/DD/YYYY HH24:MI:SS&apos;)" DESCRIPTION ="" EXPRESSION ="BOOK_DT" EXPRESSIONTYPE ="GENERAL" NAME ="BOOK_DT" PICTURETEXT ="" PORTTYPE ="INPUT/OUTPUT" PRECISION ="29" SCALE ="9"/>
<TABLEATTRIBUTE NAME ="Tracing Level" VALUE ="Normal"/>
<TRANSFORMATION DESCRIPTION ="" NAME ="Exp_SEQ_CDL_CONTRACT_D" OBJECTVERSION ="1" REUSABLE ="NO" TYPE ="Expression" VERSIONNUMBER ="8">
<TRANSFORMFIELD DATATYPE ="decimal" DEFAULTVALUE ="" DESCRIPTION ="" EXPRESSION ="V_CNT+1" EXPRESSIONTYPE ="GENERAL" NAME ="V_CNT" PICTURETEXT ="" PORTTYPE ="LOCAL VARIABLE" PRECISION ="38" SCALE ="0"/>
<TRANSFORMFIELD DATATYPE ="decimal" DEFAULTVALUE ="" DESCRIPTION ="" EXPRESSION ="IIF(V_CNT=1,:SP.GET_MAX_VALUE(&apos;CILDL.CDL_CONTRACT_D&apos;,&apos;CONTRACT_KEY&apos;),V_MAX)" EXPRESSIONTYPE ="GENERAL" NAME ="V_MAX" PICTURETEXT ="" PORTTYPE ="LOCAL VARIABLE" PRECISION ="38" SCALE ="0"/>
<TRANSFORMFIELD DATATYPE ="decimal" DEFAULTVALUE ="ERROR(&apos;transformation error&apos;)" DESCRIPTION ="" EXPRESSION ="V_CNT+V_MAX" EXPRESSIONTYPE ="GENERAL" NAME ="CONTRACT_KEY" PICTURETEXT ="" PORTTYPE ="OUTPUT" PRECISION ="38" SCALE ="0"/>
<TRANSFORMFIELD DATATYPE ="decimal" DEFAULTVALUE ="" DESCRIPTION ="" EXPRESSION ="Lkp_CONTRACT_KEY" EXPRESSIONTYPE ="GENERAL" NAME =
<TABLEATTRIBUTE NAME ="Tracing Level" VALUE ="Normal"/>
<INSTANCE DESCRIPTION ="" INSTANCEID ="16" NAME ="Exp_CDL_CONTRACT_D" REUSABLE ="NO" TRANSFORMATION_NAME ="Exp_CDL_CONTRACT_D" TRANSFORMATION_TYPE ="Expression" TYPE ="TRANSFORMATION"/>
<INSTANCE DESCRIPTION ="" INSTANCEID ="17" NAME ="Lkp_CDL_CONTRACT_D" REUSABLE ="NO" TRANSFORMATION_NAME ="Lkp_CDL_CONTRACT_D" TRANSFORMATION_TYPE ="Lookup Procedure" TYPE ="TRANSFORMATION"/>
<INSTANCE DESCRIPTION ="" INSTANCEID ="18" NAME ="Rtr_CDL_CONTRACT_D" REUSABLE ="NO" TRANSFORMATION_NAME ="Rtr_CDL_CONTRACT_D"
<MAPPINGVARIABLE DATATYPE ="date/time" DEFAULTVALUE ="" DESCRIPTION ="" ISEXPRESSIONVARIABLE ="NO" ISPARAM ="YES" NAME ="$$LAST_EXTRACT_DATE" PRECISION ="29" SCALE ="9" USERDEFINED ="YES"/>
</WORKFLOW>
</FOLDER>
</REPOSITORY>
</POWERMART>

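Use awk (with RS= the input is read in paragraph mode, so a file without blank lines is a single record):
awk -F 'Exp_CDL_CONTRACT_D|Tracing Level' '{print "\"Exp_CDL_CONTRACT_D" $2 "Tracing Level\""; exit}' RS= file.xml
Or GNU grep with -P. The -z option is needed so the pattern can span lines, and the lazy *? stops at the first "Tracing Level":
grep -zoP '"Exp_CDL_CONTRACT_D[\s\S]*?Tracing Level"' file.xml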
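Outside the shell, the same "first occurrence only" logic can be sketched in a few lines of Python. This is only an illustration, not part of the original answer: the helper name first_between is hypothetical, and the sample text is the sedtesting.txt content from the question.

```python
def first_between(text, start, end):
    """Return the first span from `start` through the next `end`, or None."""
    begin = text.find(start)
    if begin == -1:
        return None
    # Search for the end marker only after the start marker.
    stop = text.find(end, begin + len(start))
    if stop == -1:
        return None
    return text[begin:stop + len(end)]


sample = (
    "this is for testing\n"
    "so test\n"
    "AAgainn and again\n"
    "this is for testing\n"
    "so test\n"
    "AAgainn and again\n"
)
result = first_between(sample, "so", "again")
print(result)
```

Because str.find returns the first match and the search never restarts, the second "so ... again" block is ignored. The match is case-sensitive, so "AAgainn" does not end the span.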

Related

INI file - retrieve a section name by key name in VBS

I would like to retrieve a section name from an INI file with only a unique key name
My ini file :
...
[Area.104]
Title=Central North America
Local=Scenery\NAMC
Layer=104
Active=TRUE
Required=FALSE
[Area.105]
Title=Eastern North America
Local=Scenery\NAME
Layer=105
Active=TRUE
Required=FALSE
[Area.106]
Title=Western North America
Local=Scenery\NAMW
Layer=106
Active=TRUE
Required=FALSE
...
How can I get the section name [Area.105] from the unique key Title=Eastern North America?
Thank you
I have two ways of finding the required section name:
METHOD 1 (assumes the Title line directly follows the section header)
Option Explicit
Dim strFilePath, ofso, ofile, strKey, strPrev, strCurr, blnFound
strFilePath = "" '<-- Enter the absolute path of your .ini file in this variable
Set ofso = CreateObject("Scripting.FileSystemObject")
Set ofile = ofso.OpenTextFile(strFilePath, 1, False) ' 1 = ForReading
strKey = "Eastern North America" '<-- Enter the unique title whose section name you want
strPrev = ""
blnFound = False
Do Until ofile.AtEndOfStream
    strCurr = ofile.ReadLine
    If InStr(1, strCurr, strKey) <> 0 Then
        blnFound = True
        Exit Do
    End If
    strPrev = strCurr ' remember the line just before the current one
Loop
ofile.Close
If blnFound Then MsgBox strPrev ' show nothing when the title is not in the file
Set ofile = Nothing
Set ofso = Nothing
METHOD 2(Using Regular Expression)
Option Explicit
Dim strFilePath, ofso, ofile, strFileData, strKey, re, objMatches
strFilePath="" '<-- Enter the absolute path of your .ini file in this variable
Set ofso = CreateObject("scripting.FileSystemObject")
Set ofile = ofso.OpenTextFile(strFilePath,1,False)
strFileData = ofile.ReadAll()
ofile.Close
strKey = "Eastern North America" '<-- Enter Unique title for which you want the Area code
Set re = New RegExp
re.Global=True
re.Pattern="\[([^]]+)]\s*Title="&strKey
Set objMatches = re.Execute(strFileData)
If objMatches.Count>0 Then
MsgBox objMatches.Item(0).Submatches.Item(0)
End If
Set re = Nothing
Set ofile = Nothing
Set ofso = Nothing
Regex Explanation:
\[ - matches literal [
([^]]+) - capture 1+ occurrence of any character which is not ] in a group
] - matches literal ]
\s* - matches 0+ white-spaces(which include the newline characters)
Title= - matches the text Title=. This is then concatenated with the variable strKey containing the value of unique title.

VBS Get a section name from an INI file with only a unique key name with ADODB.Stream

I would like to find a section name from an INI file with only a unique key name using ADODB.Stream instead of scripting.FileSystemObject with Charset "_autodetect_all"
My ini file :
...
...
...
[Area.104]
Title=Central North America
Local=Scenery\NAMC
Layer=104
Active=TRUE
Required=FALSE
[Area.105]
Local=Scenery\NAME
Layer=105
Active=TRUE
Required=FALSE
Title=Eastern North America
[Area.106]
Local=Scenery\NAMW
Layer=106
Title=Western North America
Active=TRUE
Required=FALSE
...
...
...
How can I get the section name [Area.105] from the unique key Title=Eastern North America? Keys are in random order. Thanks
Here is the answer which I have got from another website (thank you very much omen999)
This code works perfectly with ADODB
Dim TitleName
TitleName = Array("Central North America")
Set IniStream=CreateObject("ADODB.Stream")
IniStream.Open
Inistream.Charset="_autodetect_all"
IniStream.LoadFromFile "Area.ini"
IniFile=IniStream.ReadText
PosEnd=InStrRev(IniFile,"]",InStrRev(IniFile,TitleName(0)))
PosStart=InStrRev(IniFile,"[",PosEnd)+1
Wscript.Echo Mid(IniFile,PosStart,PosEnd-PosStart)
IniStream.Close
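For comparison, the same lookup with keys in any order can be sketched in Python using the standard configparser module. This is an illustration rather than a translation of the ADODB code: the function name section_for_title is hypothetical, the sample omits the elided "..." lines, and character-set autodetection is not covered.

```python
import configparser


def section_for_title(ini_text, title):
    """Return the name of the section whose Title key equals `title`."""
    # interpolation=None keeps values such as paths completely literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    for name in parser.sections():
        # Option names are case-insensitive by default in configparser.
        if parser.get(name, "Title", fallback=None) == title:
            return name
    return None


sample = """[Area.104]
Title=Central North America
Local=Scenery\\NAMC
Layer=104
[Area.105]
Local=Scenery\\NAME
Layer=105
Title=Eastern North America
[Area.106]
Layer=106
Title=Western North America
"""
print(section_for_title(sample, "Eastern North America"))
```

Since the parser groups keys by section before the search starts, the position of Title inside a section does not matter.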

Change the format of date from "mm/dd/yyyy" to "Month dd, yyyy" in Ruby

I am trying to extract date from XML and compare it with the date in a PDF.
I am using Nokogiri to get the date from XML and PDF-Reader to read the date from PDF.
But the date in XML is in "mm/dd/yyyy" format and the date in PDF is in "Month dd, yyyy" format.
XML Tag:
<LetterSendDate>02/29/2016</LetterSendDate>
Extracting the Date from xml using Nokogiri:
@reader = file('C:\Users\ecz560\Desktop\30004_Standard.pdf').parse_pdf
@xml = file('C:\Users\ecz560\Desktop\30004_Standard.xml').parse_xmlDoc
@LettersendDate = @xml.xpath("//Customer[RTLtr_Loancust='0163426']//RTLtr_LetterSendDate").map(&:text)
Comparing the XML date with the date in PDF:
page_index = 0
@reader.pages.each do |page|
  page_index = page_index+1
  if expect(page.text).to include @LettersendDate
    valid_text = "Given text is present in -- #{page_index}"
    puts valid_text
  end
end
But page.text contains the date as February 29, 2016, so the comparison fails with an error:
Error
if expect(page.text).to include @LettersendDate
TypeError: no implicit conversion of String into Array
How can I convert the date from "mm/dd/yyyy" format to "Month dd, yyyy" format?
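The conversion itself is a parse-then-format step. As a minimal sketch (in Python rather than Ruby, with a hypothetical helper name), the date is parsed with the directives %m/%d/%Y and printed with %B %d, %Y. Ruby's Date.strptime and strftime accept the same directives.

```python
from datetime import datetime


def to_long_date(us_date):
    """Convert 'mm/dd/yyyy' into 'Month dd, yyyy'."""
    parsed = datetime.strptime(us_date, "%m/%d/%Y")
    # %B is the full month name; %d keeps a leading zero for days 1 to 9.
    return parsed.strftime("%B %d, %Y")


print(to_long_date("02/29/2016"))
```

If the PDF prints single-digit days without a leading zero, the format string would need adjusting before the two values are compared.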

XQuery does not return a result

I have the following Xquery
select email1
from customers,
XMLTABLE(
'$customer/customerinfo/contacts/phone[@type="work"]'
PASSING object_value as "customer"
columns
email1 varchar2(60) path '/emails/email1'
) as x
EMAIL1
------------------------------------------------------------
1 row selected.
When I run this against a customers table of XMLType stored in Oracle 12c, I do not get any result, only a blank row.
The xml itself looks something like this
<customerinfo xmlns:ns0="http://posample.org" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" Cid="1000">
<name>Kathy Smith</name>
<addr country="Canada">
<street>5 Rosewood</street>
<city>Toronto</city>
<prov-state>Ontario</prov-state>
<pcode-zip>M6W 1E6</pcode-zip>
</addr>
<contacts>
<phone type="work">416-555-1358</phone>
<emails>
<email1>kathy@stackoverflow.org</email1>
<email2>kathy@stackover.org</email2>
</emails>
<phone type="personal">416-555-1358</phone>
<emails>
<email1>kathy@stackoverflow.org</email1>
<email2>kathy@stackover.org</email2>
</emails>
</contacts>
</customerinfo>
I want the output to be kathy@stackoverflow.org.
<emails/> is not a child of <phone/>; it is a following sibling. This XML format is a little bit broken, as you cannot directly select any "work" email address.
An XPath expression which only matches the <email1/> node in the first <emails/> block after the "work" phone would be
/customerinfo/contacts/phone[@type="work"]/following-sibling::emails[1]/email1
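The "first emails block after the work phone" selection can also be sketched procedurally. The example below uses Python's standard xml.etree.ElementTree, which has no following-sibling axis, so it walks the children of <contacts> in document order instead. The function name work_email and the trimmed sample are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def work_email(xml_text):
    """Return email1 from the <emails> block that follows the work phone."""
    root = ET.fromstring(xml_text)  # root element is <customerinfo>
    contacts = root.find("contacts")
    if contacts is None:
        return None
    after_work_phone = False
    for child in contacts:  # children are visited in document order
        if child.tag == "phone":
            after_work_phone = child.get("type") == "work"
        elif child.tag == "emails" and after_work_phone:
            return child.findtext("email1")
    return None


sample = """<customerinfo Cid="1000">
  <contacts>
    <phone type="personal">416-555-0000</phone>
    <emails><email1>home@example.org</email1></emails>
    <phone type="work">416-555-1358</phone>
    <emails><email1>kathy@stackoverflow.org</email1></emails>
  </contacts>
</customerinfo>"""
print(work_email(sample))
```

The flag is reset on every <phone>, so an <emails> block that follows a non-work phone is skipped.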

Oracle Clob holds complex XML; how to select specific data with Xquery

I'm trying to extract specific data from a complex XML data set stored in a CLOB field in a commercial app. I cannot change the XML format (namespace, etc), I cannot change the CLOB to XMLType.
The xml data looks like:
<?xml version="1.0" encoding="utf-8"?>
<Calculation>
<ProcessUnitModelScenario Id="1265319" EntityId="10030" EntityName="Chaco Plant" ProcessUnitId="10225" ProcessUnitName="Turbine - Unit 37" EmissionModelId="10000" EmissionModelName="Emissions" ScenarioId="10053" ScenarioName="GHG_Comb_Run_Time" EffectiveDate="1/1/2012 12:00:00 AM" EndDate="2/1/2012 12:00:00 AM" ActiveDate="1/1/2008 12:00:00 AM" ProductionUnitId="10031" ProductionUnitName="Default Production Unit - Month" ProductionScheduleId="13541" OperatingPercentage="100" LinkLevel="1">
<EmissionModel Id="10935" EffectiveDate="1/1/2012 12:00:00 AM" EndDate="2/1/2012 12:00:00 AM">
<EmissionModelMaterial Id="13250" OutputType="Air Emissions" OutputTypeId="1" Media="Vapor" MediaName="Air" MaterialId="83" EquationId="10096" EquationName="GHG Combustion: Run time" EquationUnit="lb/hr" EquationUnitName="lb/hr" EquationBaseUnit="lb/hr" EquationBaseUnitName="lb/hr" SpeciationOption="StandardSpeciation" SpeciationOptionName="Standard Speciation" UseComponentVaporPressureMethods="False" VaporPressureOptionName="Material's vapor pressure methods">
<Material Id="83" Name="Methane" EffectiveId="10082" EffectiveDate="1/1/1990 12:00:00 AM" ComponentBasis="Vapor" MolecularWeight="16.043" LiquidDensity="1.34687732957939" VaporPressureMethod="Riedels" RiedelA="39.205" RiedelB="-1324.4" RiedelC="-3.4366" RiedelD="3.1E-05" RiedelE="2" UseDefinedComposition="False">
<CalculationPeriod StartDate="1/1/2012 12:00:00 AM" EndDate="2/1/2012 12:00:00 AM">
<EquationVariable Id="11079" Name="HeatRating" Order="10" BaseUnit="BTU/sec" EquationUnit="MMBtu/hr" Type="System" TypeName="System Variable" SystemCalculationType="ProcessUnitProperty" SystemCalculationName="Process Unit Property" SystemParameterProcessPropertyId="10005" SystemParameterModelOutputTypeId="1" TimeDependent="False" Value="116" EnteredValue="116" EnteredUnit="MMBtu/hr" />
<EquationVariable Id="11077" Name="GHGEF" Order="20" BaseUnit="lb/BTU" EquationUnit="kg/MMBTU" Type="GlobalEmissionFactor" TypeName="Global Emission Factor" TimeDependent="True" Value="0.001" EnteredValue="0.001" EnteredUnit="kg/MMBTU" />
<EquationVariable Id="11078" Name="RunHrs" Order="30" BaseUnit="hr" Type="Parameter" TypeName="Parameter" ParameterLevel="ProcessUnit" ParameterLevelName="Process Unit" ParameterId="10044" ParameterName="RunHrs - " TimeDependent="True" Value="612" EnteredValue="612" EnteredUnit="hr" />
<EquationVariable Id="11080" Name="kgtolb" Order="40" BaseUnit="lb" Type="GlobalConstant" TypeName="Global Constant" GlobalConstantId="10007" TimeDependent="False" Value="2.20462" />
<EquationVariable Id="11081" Name="OpHrs" Order="45" BaseUnit="hr" EquationUnit="hr" Type="System" TypeName="System Variable" SystemCalculationType="OperatingHours" SystemCalculationName="Operating Hours" TimeDependent="True" Value="744" />
<EquationVariable Id="11082" Name="EmissionRate" Order="46" BaseUnit="lb/hr" Type="FinalResult" TypeName="Final Expression" Formula="(HeatRating*GHGEF)*RunHrs*kgtolb/OpHrs" TimeDependent="True" Value="0.210363418064516" />
<Emission EffectiveDate="1/1/2012 12:00:00 AM" EndDate="2/1/2012 12:00:00 AM" BaseUnit="lb/hr" BaseUnitName="lb/hr" EmissionAmount="0.210363418064516" Unit="lb/hr" UnitName="lb/hr" ResultValue="0.210363418064516" LinkType="Unabated" LinkTypeName="" OperatingHours="744" EmissionMass="156.51038304" EmissionMassUnit="lb" MaterialId="83" EffectiveMaterialId="10082" MaterialName="Methane" MaterialEffectiveDate="1/1/1990 12:00:00 AM" />
</CalculationPeriod>
</Material>
<Material etc...>
</Material>
</EmissionModelMaterial>
<EmissionModelMaterial etc...>
</EmissionModelMaterial>
</EmissionModel>
<EmissionModel etc...>
</EmissionModel>
<ProcessUnitModelScenario etc...>
</ProcessUnitModelScenario>
</Calculation>
I need to return certain attribute values from the elements for a specified combination of [ProcessUnitModelScenario/@ProcessUnitId], [ProcessUnitModelScenario/@ScenarioId], and [Material/@Id].
The XML data is kept in the Verbose_Xml CLOB field of the Air_Calc_Log table.
In my PL/SQL I am (mis?)using the following select:
SELECT
XMLType(l.verbose_xml).extract(
'for $scen in /Calculation/ProcessUnitModelScenario
where ($scen/@ScenarioId="10053")
return $scen/* ')
FROM air_calc_log l
WHERE l.vld_site_id = 10030 -- pVldSite
AND l.start_date = To_Date('01/01/2012','mm/dd/yyyy') -- pStartDate
AND l.End_Date = To_Date('04/01/2012','mm/dd/yyyy')
Whatever combination of XQuery/XPath using FLWOR syntax I use, I always get the following error message:
ORA-31011: XML parsing failed
ORA-19202: Error occurred in XML processing
LPX-00601: Invalid token in: 'for $scen in /Calculation/ProcessUnitModelScenario
where ($scen/@ScenarioId="10053")
return $scen/* '
ORA-06512: at "SYS.XMLTYPE", line 111
Can someone help point out what I'm doing wrong?
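XMLType.extract() expects an XPath expression, not a FLWOR XQuery, which is why the parser rejects the "for" token. Try it like this:
SELECT
XMLType(l.verbose_xml).extract(
'/Calculation/ProcessUnitModelScenario[@ScenarioId="10053"]')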
FROM air_calc_log l
WHERE l.vld_site_id = 10030 -- pVldSite
AND l.start_date = To_Date('01/01/2012','mm/dd/yyyy') -- pStartDate
AND l.End_Date = To_Date('04/01/2012','mm/dd/yyyy')
Here is a fiddle (Note that I had to change your XML to make it well-formed)
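To make the attribute filtering concrete outside the database, here is a small Python sketch that applies the same three filters (ProcessUnitId, ScenarioId, Material Id) with the standard xml.etree.ElementTree module. The function name material_names and the heavily trimmed sample document are assumptions; it is not a replacement for the Oracle query.

```python
import xml.etree.ElementTree as ET


def material_names(xml_text, unit_id, scenario_id, material_id):
    """Collect Material/@Name for one unit, scenario and material id."""
    root = ET.fromstring(xml_text)  # root element is <Calculation>
    names = []
    for scen in root.findall("ProcessUnitModelScenario"):
        if scen.get("ProcessUnitId") != unit_id:
            continue
        if scen.get("ScenarioId") != scenario_id:
            continue
        # iter() walks every <Material> descendant of this scenario.
        for material in scen.iter("Material"):
            if material.get("Id") == material_id:
                names.append(material.get("Name"))
    return names


sample = """<Calculation>
  <ProcessUnitModelScenario ProcessUnitId="10225" ScenarioId="10053">
    <EmissionModel>
      <EmissionModelMaterial>
        <Material Id="83" Name="Methane"/>
        <Material Id="84" Name="Ethane"/>
      </EmissionModelMaterial>
    </EmissionModel>
  </ProcessUnitModelScenario>
  <ProcessUnitModelScenario ProcessUnitId="10226" ScenarioId="10054">
    <EmissionModel>
      <EmissionModelMaterial>
        <Material Id="83" Name="Methane"/>
      </EmissionModelMaterial>
    </EmissionModel>
  </ProcessUnitModelScenario>
</Calculation>"""
print(material_names(sample, "10225", "10053", "83"))
```

Attribute values are compared as strings here, matching how they appear in the XML.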
