I researched delimit issue for a while and I pull useful codes here and there, but I can't quite put it together.
I'm trying to parse the string by word in SSIS and I NEED help on vb script component.
I need to delimit my column data to the following deliminator:
"AND","OR","**", ","
I have a table like this
ID Description
1 apple AND orange, tangerine
2 avocado OR guacamole AND pineapple OR fruit
3 watermelon ** melon
And I want to parse the data like this
ID Description
1 apple
1 orange
1 tangerine
2 avocado
2 guacamole
2 pineapple
2 fruit
3 watermelon
3 melon
Thank you.
In order to parte string by words is enought a combination of replace a split:
(I assume that you know how to take ID)
split(
replace(
replace (
replace( Description, "AND", ","),
"OR", ","
),
"**", ","
), ","
)
this return a array of elements as you ask for:
id = 2
a=my_previous_functions_combination("avocado OR guacamole AND pineapple OR fruit")
for each fruit in a
do something with id and fruit
next
so far, help with vb. I don't know what you want to do in SSIS: a calculated member? a named set? Extend fact table? Read second answer part:
Second part:
To convert a row in multiples row you need a script. You can find a good example in SSIS - Script Component, Split single row to multiple rows post.
Related
I am trying to build a string of values to be inserted into an SQL IN list. For example -
SELECT * FROM TABLE WHERE field IN ('AAA', 'BBB', 'CCC', 'DDD')
The list that I want needs to be constructed from values within a single column of my dataset but I'm struggling to find a way to concatenate those values.
My first thought was to use CASESTOVARS to put each of the values into columns prior to concat. This is simple but the number of cases is variable.
Is there a way to concat all fields without specifying?
Or is there a better way to go about this?
Unfortunately Python is not an option for me in this instance.
A simple sample dataset would be -
CasestoConcat
AAA
BBB
CCC
DDD
You can use the lag function for this.
First creating a bit of sample data to demonstrate on:
data list free/grp (F1) txt (a5).
begin data
1 "aaa" 1 "bb" 1 "cccc" 2 "d" 2 "ee" 2 "fff" 2 "ggggg" 3 "hh" 3 "iii"
end data.
Now the following code makes sure that rows that belong together are consecutive. You can also sort by any other relevant variable to keep the combined text in a specific order.
sort cases by grp.
string merged (A1000).
compute merged=txt.
if $casenum>1 and grp=lag(grp) merged=concat(rtrim(merged), " ", rtrim(lag(merged))).
exe.
At this point if you want to just keep the line that has all the concatenated texts, you can use this:
add files /file=* /by grp /last=lst.
select if lst=1.
exe.
I have an excel file with information about variables (excel1) and another one with information about lists (excel2).
In order to create a syntax to generate a new syntax to create VARIABLE and VALUES LABELS, I used solution proposed by #eli.k here.
But with this solution I have to have a dataset with lists so I could use it instead of writing it “by hand” (copy/paste) (here). One problem came with L2, which has 195 entries so the new create variable would need to be bigger that 20.000 characters (is this possible in SPSS?), appearing all in one line.
What I want to know is if it’s possible to use excel2 automatically in code, line by line.
Using the following code:
GET DATA
/TYPE=XLSX
/FILE=" D:\excel1.xlsx "
/SHEET=name 'Folha1'
/CELLRANGE=FULL
/READNAMES=ON
/DATATYPEMIN PERCENTAGE=95.0.
STRING cmd1 cmd2 (a200).
SORT CASES by List.
MATCH FILES /FILE=* /FIRST=first /LAST=last /BY List. /* marking first and last lines.
DO IF first.
COMPUTE cmd1="VARIABLE LABELS".
COMPUTE cmd2="VALUE LABELS".
END IF.
IF not first cmd1=concat(rtrim(cmd1), " "). /* "/" only appears from the second varname.
COMPUTE cmd1=concat(rtrim(cmd1), " ", Var_label).
COMPUTE cmd2=concat(rtrim(cmd2), " ", Var).
DO IF last.
COMPUTE cmd1=concat(rtrim(cmd1), ".").
COMPUTE cmd2=concat(rtrim(cmd2), " ",' 1 "Afghanistan" 2 "Albania" (…) 195 "Zimbabwe".').
END IF.
EXECUTE.
SELECT IF ('List' 'L2').
ADD FILES /file=* /rename cmd1=cmd /file=* /rename cmd2=cmd.
EXECUTE.
I would like to know if there is a way to replace ' 1 "Afghanistan" 2 "Albania" (…) 195 "Zimbabwe".'' by some function/procedure to grab information from excel2 concerning L2, and showing it line by line:
(…)
VARIABLE LABELS V2 "Country"
/ V3 "Country Mother"
/ V4 "Country Father".
VALUE LABELS V2
V3
V4
1 "Afghanistan"
2 "Albania"
(…)
195 "Zimbabwe".
Thanks for helping me!
This issue is pretty complex and would usually be beyond the scope of Stack-Overflow Q&A but here's my answer anyway:
First I recreate the parts of your example data concerning the value labels only:
data list list/var list (2a5).
begin data
"v1" "L1"
"v2" "L2"
"v3" "L2"
"v4" "L2"
end data.
dataset name xl1.
data list list/list (a5) nb (f5) nb_txt (a20).
begin data
"L1" 1 "Female"
"L1" 2 "Male"
"L2" 1 "Afghanistan"
"L2" 2 "Albania"
"L2" 43 "Israel"
"L2" 195 "Zimbabwe"
end data.
dataset name xl2.
data list list/v1 v2 v3 v4 (4f3).
begin data
1 1 2 3
2 2 2 43
1 2 1 195
end data.
dataset name gen.
Now to work:
The first part is to create a macro for each list of variable labels. since some of the lists are long, I use ADD Value labels separately for each value.
dataset activate xl2.
string cmd (a200) cmdFin (a20).
sort cases by list nb.
match files /file=* /by list /first=first /last=last.
compute cmd=concat("add value labels !1 ", string(nb,f6), " '", rtrim(nb_txt), "' .").
if first cmd=concat("define dolist_", list, " (!pos=!cmdend) ", rtrim(cmd)).
if last cmdFin=" !enddefine .".
write outfile="path\create value label macros.sps"/cmd/cmdfin.
exe.
insert file="path\create value label macros.sps".
After inserting the generated syntax a macro has been defined for each of the value lists. Now we create an additional syntax that will run the related macro for each of the variable names in the list:
dataset activate xl1.
string cmd (a200).
compute cmd=concat("dolist_", list, " ", var, " .").
write outfile="path\run value label macros.sps"/cmd.
exe.
Now we can actually try out the generated macros on our original data:
dataset activate gen.
insert file="path\run value label macros.sps".
UFT-vbScript- I am reading a getROproperty of a link from an application.And the link has 300 different values following similar class pattern for many links like [PDF LLC- USA , [PDF MMB CANADA ,[PDF MCCS AUSTRALIA ,[PDF SSC MEXICO. [PDF ACCS MEXICO My question here is I just want to display the country name removing the other associated strings .How will I achieve this progamatically using vbscript. one way of doing this is using SPLIT fxn , but the real question is how will one know which pattern to choose from .
You should always post a sample code which you have tried along with your question.
If there is always a space (" ") just before your country name, use space to split up like the below
strExtractedValue = "[PDF LLC- USA"
arrExtract = Split(strExtractedValue , " ")
strCountryName = arrExtract(2) ' the 3rd element
Problem
I need to insert text of arbitrary length ( # of lines ) into a template while maintaining an exact number of total lines.
Sample source data file:
You have a hold available for pickup as of 2012-01-13:
Title: Really Long Test Title Regarding Random Gibberish. Volume 1, A-B, United States
and affiliated territories, United Nations, countries of the world
Author: Barrel Roll Morton
Title: How to Compromise Free Speech Using Everyday Tools. Volume XXVI
Author: Lamar Smith
#end-of-record
You have a hold available for pickup as of 2012-01-13:
Title: Selling Out Democracy For Fun and Profit. Volume 1, A-B, United States
Author: Lamar Smith
Copy: 12
#end-of-record
Sample Template ( simplified for brevity ):
<%CUST-NAME%>
<%CUST-ADDR%>
<%CUST-CTY-ZIP%>
<%TITLES GO HERE%>
<%STORE-NAME%>
<%STORE-ADDR%>
<%STORE-CTY-ZIP%>
At this point I use bash's 'mapfile' to load the source file
record by record using the /^#end-of-file/ regex ...so far so good.
Then I pull predictable aspects of each record according to the line
on which they occur, then process the info using a series of sed
search replace statements.
The Hang-Up
So the problem is the unknown number of 'title' records that could occur.
How can I accommodate an unknown number of titles and always have output
of precisely 65 lines?
Given that title records always occur starting on line 8, I can pull the
titles easily with:
sed -n '8,$p' test-match.txt
However, how can I insert this within an allotted space, ex, between <%CUST-CTY-ZIP%> and <%STORE-NAME%> without pushing the store info out of place in the template?
My idea so far:
-first send the customer info through:
Ex.
sed 's/<%CUST-NAME%>/Benedict Arnold/' template.txt
-Append title records
???
-Then the store/location info
sed 's/<%STORE-NAME%>/Smith's House of Greasy Palms/' template.txt
I have code and functions for this stuff if interested but this post is 'windy' as it is.
Just need help with inserting the title records while maintaining position of following text and maintaining total line number of 65.*
UPDATE
I've decided to change tactics. I'm going to create place holders in the template for all available lines between customer and store info --- then:
Test if line is null in source
if yes -- replace placeholder with null leaving the line ending. Line number maintained.
if not null -- again, replace with text, maintaining line number and line endings in template.
Eventually, I plan to invest some time looking closer at Triplee's suggestion regarding Perl. The Perl way really does look simpler and easier to maintain if I'm going to be stuck with this project long term.
This might work for you:
cat <<! >titles.txt
> 1
> 2
> 3
> 4
> 5
> 6
> 7
> Title 1
> Title 2
> Title 3
> Title 4
> Title 5
> Title 6
> !
cat <<! >template.txt
> <%CUST-NAME%>
> <%CUST-ADDR%>
> <%CUST-CTY-ZIP%>
>
> <%TITLES GO HERE%>
>
> <%STORE-NAME%>
> <%STORE-ADDR%>
> <%STORE-CTY-ZIP%>
> !
sed '1,7d;:a;$!{N;ba};:b;G;s/\n[^\n]*//5g;tc;bb;:c;s/\n/\\n/g;s|.*|/<%TITLES GO HERE%>/c\\&|' titles.txt |
sed -f - template.txt
<%CUST-NAME%>
<%CUST-ADDR%>
<%CUST-CTY-ZIP%>
Title 1
Title 2
Title 3
Title 4
Title 5
<%STORE-NAME%>
<%STORE-ADDR%>
<%STORE-CTY-ZIP%>
This pads/squeezes the titles to 5 lines (s/\n[^\n]*//5g) if you want fewer or more change the 5 to the number desired.
This will give you five lines of output regardless of the number of lines in titles.txt:
sed -n '$s/$/\n\n\n\n\n/;8,$p' test-match.txt | head -n 5
Another version:
sed -n '8,$N; ${s/$/\n\n\n\n\n/;s/\(\([^\n]*\n\)\{4\}\).*/\1/p}' test-match.txt
Use one less than the number of lines you want (4 in this example will cause 5 lines of output).
Here's a quick proof of concept using Perl formats. If you are unfamiliar with Perl, I guess you will need some additional help with how to get the values from two different files, but it's quite doable, of course. Here, the data is simply embedded into the script itself.
I set the $titles format to 5 lines instead of the proper value (58 or something?) in order to make this easier to try out in a terminal window, and to demonstrate that the output is indeed truncated when it is longer than the allocated space.
#!/usr/bin/perl
use strict;
use warnings;
use vars (qw($cust_name $cust_addr $cust_cty_zip $titles
$store_name $store_addr $store_cty_zip));
my $fmtline = '#' . '<' x 78;
my $titlefmtline = '^' . '<' x 78;
my $empty = '';
my $fmt = join ("\n$fmtline\n", 'format STDOUT = ',
'$cust_name', '$cust_addr', '$cust_cty_zip', '$empty') .
("\n$titlefmtline\n" . '$titles') x 5 . #58
join ("\n$fmtline\n", '', '$empty',
'$store_name', '$store_addr', '$store_cty_zip');
#print $fmt;
eval "$fmt\n.\n";
titles = <<____HERE;
Title: Really Long Test Title Regarding Random Gibberish. Volume 1, A-B, United States
and affiliated territories, United Nations, countries of the world
Author: Barrel Roll Morton
Title: How to Compromise Free Speech Using Everyday Tools. Volume XXVI
Author: Lamar Smith
____HERE
# Preserve line breaks -- ^<< will fill lines, but preserves line breaks on \r
$titles =~ s/\n/\r\n/g;
while (<DATA>) {
chomp;
($cust_name, $cust_addr, $cust_cty_zip, $store_name, $store_addr, $store_cty_zip)
= split (",");
write STDOUT;
}
__END__
Charlie Bravo,23 Alpa St,Delta ND 12345,Spamazon,98 Spamway,Atlanta GA 98765
The use of $empty to get an empty line is pretty ugly, but I wanted to keep the format as regular as possible. I'm sure it could be avoided, but at the cost of additional code complexity IMHO.
If you are unfamiliar with Perl, the use strict is a complication, but a practical necessity; it requires you to declare your variables either with use vars or my. It is a best practice which helps immensely if you try to make changes to the script.
Here documents with <<HERE work like in shell scripts; it allows you to create a multi-line string easily.
The x operator is for repetition; 'string' x 3 is 'stringstringstring' and ("list") x 3 is ("list" "list" "list"). The dot operator is string concatenation; that is, "foo" . "bar" is "foobar".
Finally, the DATA filehandle allows you to put arbitrary data in the script file itself after the __END__ token which signals the end of the program code. For reading from standard input, use <> instead of <DATA>.
hello i have i want to do something like this.
i have 4 rows with unique id 1,2,3,4 all four rows contains some string like
option1,option2,option3,option4
now i want to add "a ) " to the option1, "b ) " to the option2 and so on so is there a way i can do this with a query.currently i am adding these to a lots of rows manually
It's not clear exactly by what logic you want to select the letter to prepend to field somestring, but if for example it's a "Caesar's cypher" (1 gives 'a', 2 gives 'b' etc) based on the id field, as your question suggests, then this should work:
UPDATE sometable
SET somestring = (
substr('abcdefghijklmnopqrstuvwxyz', id, 1) ||
' ) ' || somestring)
WHERE id <= 26;
...for no more than 26 rows of course, since beyond that the logic must change and obviously we can't guess just how you want to extend it (use id modulo 26 + 1, use more characters than just lowercase letters, or ...?) since you give no clue on why you want to do this.