In the for loop below, I am concatenating the value of ReferenceNumber, and after the loop I take a substring of the first 140 characters. Say the output of the loop is a string of ~600 characters; I want to split that string across 5 Ustrd tags, each holding 140 characters of the loop output. To explain properly, I have put the details below. Please help.
I want the next Ustrd tag to start at the 141st character and hold the next 140 characters, the one after that to cover the 281st to 420th characters, and so on (i.e. each Ustrd tag should hold 140 characters and then move on to the next tag). Please suggest how I should make this change.
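The question does not say which language the loop is written in, so here is a minimal sketch of the chunking logic in Python; split_into_ustrd_chunks and reference_text are hypothetical names, and each 140-character slice would become the content of one Ustrd tag.

def split_into_ustrd_chunks(reference_text, size=140):
    # Slice the concatenated string into consecutive 140-character pieces:
    # characters 1-140, 141-280, 281-420, and so on.
    return [reference_text[i:i + size] for i in range(0, len(reference_text), size)]

reference_text = "X" * 600  # stand-in for the ~600-character loop output
for chunk in split_into_ustrd_chunks(reference_text):
    # Each chunk would become the content of one <Ustrd> element.
    print("<Ustrd>{}</Ustrd>".format(chunk))

With a ~600-character input this produces five tags: four full 140-character chunks and a final shorter one.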
I would like to extract all occurrences of a URL string pattern (which can appear multiple times in a file) to build a list of all occurrences.
Currently I can identify each occurrence with the Find in Files feature, but I would like the Extract feature to list each occurrence on a new line. At the moment the feature lists each line that contains the string, and a line can contain the string multiple times.
My goal is to get a list of the full URL that contains __data/assets/
In the below example __data/assets/ occurs 48 times.
However, the extract produces only 44 lines, whereas I need to output all 48 occurrences (the full URLs).
I will be running this extract over 270 files in total.
View source of this example webpage:
https://www.walkerville.sa.gov.au/council/strategic-plans/2020-2024-living-in-the-town-of-walkerville-a-strategic-community-plan
It looks like all URLs are surrounded by double quotation marks.
If so, you can search for a regular expression:
[^\"]*__data/assets/[^\"]*
and select Display Matched Strings Only in the Extract Options dialog box.
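If you also want to collect the matches outside the editor, for example across all 270 files at once, a small script applying the same regular expression would do it. Here is a sketch in Python; the pages/*.html path is a placeholder for wherever the saved files live.

import glob
import re

# Same idea as the editor search: a run of non-quote characters around __data/assets/.
pattern = re.compile(r'[^"]*__data/assets/[^"]*')

urls = []
for path in glob.glob("pages/*.html"):  # placeholder path for the 270 files
    with open(path, encoding="utf-8", errors="ignore") as f:
        urls.extend(pattern.findall(f.read()))

# One full URL per line, including repeated occurrences on the same source line.
for url in urls:
    print(url)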
I have a large CSV of 4.5M+ rows (commas are the separators) containing tweets. The CSV dates from some time ago and has all manner of line breaks inside column data, stray characters, and so on. It is likely malformed in some ways, but it is difficult for me to discern exactly where and how with a file of this size.
I want to move through this CSV file as one large body of text, pull out all the Tweet IDs, and put each extracted ID on its own line in a new file.
Doing this via bash, Perl, or Python will work fine. Can anyone help here? I can't even seem to find info on the format of a Tweet ID, though the ones in this corpus all seem to be 17-digit integers.
Since the only evidence in your question for what a Tweet ID looks like is that it's an integer of length 17, that is the only rule I am going to use.
Plus, I am going to use it as a hard-and-fast rule: anything that is an integer of length 17 is a Tweet ID, and nothing else is.
After that it's a normal regular expression search.
import re

# Sample data: each line starts with a 17-digit number followed by other columns.
string = '''
12345678912345678, abcd, efgh
45645645645645645, ijkl, mnop
78944556677889900, qrst, uvwx
0, y, z
'''

# Find every run of exactly 17 digits in the text.
m = re.findall(r'[0-9]{17}', string)
print(m)
re.findall searches for the regular expression (first argument) in the string (second argument).
(a): [0-9] means any digit from 0 to 9.
(b): {m} means the regular expression that precedes it must repeat m times.
(a)+(b): [0-9]{17} matches a string of digits 0 to 9 repeated 17 times, i.e. a number of length 17.
You can find out more about the re module in the Python documentation.
This is as much as I can help you without knowing more about the input file and the Tweet format.
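To go from the hard-coded sample string to the actual task (one ID per line in a new file), a sketch along the same lines could look like the following; tweets.csv and tweet_ids.txt are placeholder file names.

import re

pattern = re.compile(r'[0-9]{17}')  # the same hard-and-fast 17-digit rule

# tweets.csv and tweet_ids.txt are placeholder names for the input and output files.
with open("tweets.csv", encoding="utf-8", errors="ignore") as src, \
        open("tweet_ids.txt", "w", encoding="utf-8") as dst:
    for line in src:  # streaming line by line keeps memory use low for 4.5M+ rows
        for tweet_id in pattern.findall(line):
            dst.write(tweet_id + "\n")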
I'm trying to enter a number that starts with leading zeros into a field. When this number gets entered, the leading zeros are removed.
Expected: 000632 Actual: 632
These are the last 6 digits of a longer number so I need the zero(s) there.
The parameter that this number is used in has been converted to a string, but it's still removing the zeros.
I do have a Transform block in my env file to automatically convert digits to integer:
Transform /^\d+$/ do |number|
number.to_i
end
The regex used for my string, though, is ([^"]*/)
It looks like the Transform block is interfering. Is there a way around this? I'm no Regex master :-)
Thanks
Transform blocks are checked for a match against every capture extracted in your test steps. Because of this, the string "000632" matches the transform you posted and you get the integer instead of the wanted string. If you want to prevent strings of digits beginning with 0 from matching, you need to change your transform regex to something like
Transform /^[1-9]\d*$/ do |number|
number.to_i
end
which will then only match strings of digits beginning with 1-9.
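The difference is easy to check by testing both patterns against the example value. A quick sketch in Python (the regex behaves the same way in Ruby):

import re

value = "000632"

# The original transform pattern matches any all-digit string, including "000632",
# so the transform fires and the leading zeros are lost when it converts to an integer.
print(bool(re.fullmatch(r"\d+", value)))       # True

# The adjusted pattern requires the first digit to be 1-9, so "000632" no longer
# matches and stays a string with its leading zeros intact.
print(bool(re.fullmatch(r"[1-9]\d*", value)))  # False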
I have code like this:
FTL:
<#compress>
${doc["root/uniqCode"]}
</#compress>
The input is an XML NodeModel.
The XML element contains data like: "ID_234   567_89 "
When it is processed, the output is: "ID_234 567_89"
The three whitespace characters between 234 and 567 are trimmed down to one, and all the whitespace at the end of the value is lost.
I need the value as it is: "ID_234   567_89 "
When I removed the <#compress> tags it works as expected, irrespective of newFactory.setIgnoringElementContentWhitespace(true).
Why does the <#compress> tag trim the data produced by ${}?
Please help.
You could simply replace the characters you don't want manually (in the following example: tabs, carriage returns, and newlines), e.g.
${doc["root/uniqCode"]?replace("[\\t\\r\\n]", "", "rm")}
See ?replace built-in for strings: http://freemarker.org/docs/ref_builtins_string.html#ref_builtin_replace
So basically I have a record that looks like this:
modulis = record
  kodas : string[4];
  pavadinimas : string[30];
  skaicius : integer;
  kiti : array[1..50] of string;
end;
And I'm trying to read it from the text file like this:
ReadLn(f1, N);
for i := 1 to N do
begin
  Read(f1, moduliai[i].kodas);
  Read(f1, moduliai[i].pavadinimas);
  Read(f1, moduliai[i].skaicius);
  for j := 1 to moduliai[i].skaicius do
    Read(f1, moduliai[i].kiti[j]);
  ReadLn(f1);
end;
And the file looks like this:
9
IF01 Programavimo ivadas 0
IF02 Diskrecioji matematika 1 IF01
IF03 Duomenu strukturos 2 IF01 IF02
IF04 Skaitmenine logika 0
IF05 Matematine logika 1 IF04
IF06 Operaciju optimizavimas 1 IF05
IF07 Algoritmu analize 2 IF03 IF06
IF08 Asemblerio kalba 1 IF03
IF09 Operacines sistemos 2 IF07 IF08
And I'm getting error 106 (bad numeric format). I can't figure out how to fix this; I'm not sure, but I think it has something to do with the text file. However, I copied the text file from the internet, so it should be fine :|
Reading string data is different from reading numeric data in Pascal.
With numbers, the Read instruction consumes data until it hits whitespace or the end of the file. Whitespace in this case can be the space character, the tab character, or the EOL 'character'. So if there are two numbers on one line of text, you could read them one by one using two consecutive Reads.
I believe you already knew that.
And I believe you thought it would work the same way with strings. But it won't: you cannot read two string values from one line of text simply by using two consecutive Read instructions. Read consumes all the text up to the EOL or EOF. After the read, the string variable is assigned however many characters it can hold, and the rest of the data is thrown into oblivion. In this respect it is essentially equivalent to ReadLn.
Solution? Arrange all the data in the input file on separate lines, and preferably use ReadLn instead of all the Reads. (But I think the latter might be unnecessary, and rearranging the input data might be enough.)
Alternatively, you would need to read the whole line of text into a temporary string variable, then split it manually and assign the parts to the corresponding record fields, not forgetting also to convert the numeric values from string to integer.
Choose whatever suits you better.
Because you have declared pavadinimas as string[30], it reads 30 characters no matter what the length of the string is. For example, in the following line pavadinimas will be
" Skaitmenine logika 0" instead of just "Skaitmenine logika"
IF04 Skaitmenine logika 0
I'm not a Pascal programmer, but it looks like the fields within your text file are not fixed-length. How would you expect your program to delimit each field when reading it back?