Related
I have the following code:
$DatabaseSettings = #();
$NewDatabaseSetting = "" | select DatabaseName, DataFile, LogFile, LiveBackupPath;
$NewDatabaseSetting.DatabaseName = "LiveEmployees_PD";
$NewDatabaseSetting.DataFile = "LiveEmployees_PD_Data";
$NewDatabaseSetting.LogFile = "LiveEmployees_PD_Log";
$NewDatabaseSetting.LiveBackupPath = '\\LiveServer\LiveEmployeesBackups';
$DatabaseSettings += $NewDatabaseSetting;
When I try to use one of the properties in a string execute command:
& "$SQlBackupExePath\SQLBackupC.exe" -I $InstanceName -SQL `
"RESTORE DATABASE $DatabaseSettings[0].DatabaseName FROM DISK = '$tempPath\$LatestFullBackupFile' WITH NORECOVERY, REPLACE, MOVE '$DataFileName' TO '$DataFilegroupFolder\$DataFileName.mdf', MOVE '$LogFileName' TO '$LogFilegroupFolder\$LogFileName.ldf'"
It tries to just use the value of $DatabaseSettings rather than the value of $DatabaseSettings[0].DatabaseName, which is not valid.
My workaround is to have it copied into a new variable.
How can I access the object's property directly in a double-quoted string?
When you enclose a variable name in a double-quoted string it will be replaced by that variable's value:
$foo = 2
"$foo"
becomes
"2"
If you don't want that you have to use single quotes:
$foo = 2
'$foo'
However, if you want to access properties, or use indexes on variables in a double-quoted string, you have to enclose that subexpression in $():
$foo = 1,2,3
"$foo[1]" # yields "1 2 3[1]"
"$($foo[1])" # yields "2"
$bar = "abc"
"$bar.Length" # yields "abc.Length"
"$($bar.Length)" # yields "3"
PowerShell only expands variables in those cases, nothing more. To force evaluation of more complex expressions, including indexes, properties or even complete calculations, you have to enclose those in the subexpression operator $( ) which causes the expression inside to be evaluated and embedded in the string.
#Joey has the correct answer, but just to add a bit more as to why you need to force the evaluation with $():
Your example code contains an ambiguity that points to why the makers of PowerShell may have chosen to limit expansion to mere variable references and not support access to properties as well (as an aside: string expansion is done by calling the ToString() method on the object, which can explain some "odd" results).
Your example contained at the very end of the command line:
...\$LogFileName.ldf
If properties of objects were expanded by default, the above would resolve to
...\
since the object referenced by $LogFileName would not have a property called ldf, $null (or an empty string) would be substituted for the variable.
Documentation note: Get-Help about_Quoting_Rules covers string interpolation, but, as of PSv5, not in-depth.
To complement Joey's helpful answer with a pragmatic summary of PowerShell's string expansion (string interpolation in double-quoted strings ("...", a.k.a. expandable strings), including in double-quoted here-strings):
Only references such as $foo, $global:foo (or $script:foo, ...) and $env:PATH (environment variables) can directly be embedded in a "..." string - that is, only the variable reference itself, as a whole is expanded, irrespective of what follows.
E.g., "$HOME.foo" expands to something like C:\Users\jdoe.foo, because the .foo part was interpreted literally - not as a property access.
To disambiguate a variable name from subsequent characters in the string, enclose it in { and }; e.g., ${foo}.
This is especially important if the variable name is followed by a :, as PowerShell would otherwise consider everything between the $ and the : a scope specifier, typically causing the interpolation to fail; e.g., "$HOME: where the heart is." breaks, but "${HOME}: where the heart is." works as intended.
(Alternatively, `-escape the :: "$HOME`: where the heart is.", but that only works if the character following the variable name wouldn't then accidentally form an escape sequence with a preceding `, such as `b - see the conceptual about_Special_Characters help topic).
To treat a $ or a " as a literal, prefix it with escape char. ` (a backtick); e.g.:
"`$HOME's value: $HOME"
For anything else, including using array subscripts and accessing an object variable's properties, you must enclose the expression in $(...), the subexpression operator (e.g., "PS version: $($PSVersionTable.PSVersion)" or "1st el.: $($someArray[0])")
Using $(...) even allows you to embed the output from entire commands in double-quoted strings (e.g., "Today is $((Get-Date).ToString('d')).").
Interpolation results don't necessarily look the same as the default output format (what you'd see if you printed the variable / subexpression directly to the console, for instance, which involves the default formatter; see Get-Help about_format.ps1xml):
Collections, including arrays, are converted to strings by placing a single space between the string representations of the elements (by default; a different separator can be specified by setting preference variable $OFS, though that is rarely seen in practice) E.g., "array: $(#(1, 2, 3))" yields array: 1 2 3
Instances of any other type (including elements of collections that aren't themselves collections) are stringified by either calling the IFormattable.ToString() method with the invariant culture, if the instance's type supports the IFormattable interface[1], or by calling .psobject.ToString(), which in most cases simply invokes the underlying .NET type's .ToString() method[2], which may or may not give a meaningful representation: unless a (non-primitive) type has specifically overridden the .ToString() method, all you'll get is the full type name (e.g., "hashtable: $(#{ key = 'value' })" yields hashtable: System.Collections.Hashtable).
To get the same output as in the console, use a subexpression in which you pipe to Out-String and apply .Trim() to remove any leading and trailing empty lines, if desired; e.g.,
"hashtable:`n$((#{ key = 'value' } | Out-String).Trim())" yields:
hashtable:
Name Value
---- -----
key value
[1] This perhaps surprising behavior means that, for types that support culture-sensitive representations, $obj.ToString() yields a current-culture-appropriate representation, whereas "$obj" (string interpolation) always results in a culture-invariant representation - see this answer.
[2] Notable overrides:
• The previously discussed stringification of collections (space-separated list of elements rather than something like System.Object[]).
• The hashtable-like representation of [pscustomobject] instances (explained here) rather than the empty string.
#Joey has a good answer. There is another way with a more .NET look with a String.Format equivalent, I prefer it when accessing properties on objects:
Things about a car:
$properties = #{ 'color'='red'; 'type'='sedan'; 'package'='fully loaded'; }
Create an object:
$car = New-Object -typename psobject -Property $properties
Interpolate a string:
"The {0} car is a nice {1} that is {2}" -f $car.color, $car.type, $car.package
Outputs:
# The red car is a nice sedan that is fully loaded
If you want to use properties within quotes follow as below. You have to use $ outside of the bracket to print property.
$($variable.property)
Example:
$uninstall= Get-WmiObject -ClassName Win32_Product |
Where-Object {$_.Name -like "Google Chrome"
Output:
IdentifyingNumber : {57CF5E58-9311-303D-9241-8CB73E340963}
Name : Google Chrome
Vendor : Google LLC
Version : 95.0.4638.54
Caption : Google Chrome
If you want only name property then do as below:
"$($uninstall.name) Found and triggered uninstall"
Output:
Google Chrome Found and triggered uninstall
I would like to find a regex that will pick out all commas that fall outside quote sets.
For example:
'foo' => 'bar',
'foofoo' => 'bar,bar'
This would pick out the single comma on line 1, after 'bar',
I don't really care about single vs double quotes.
Has anyone got any thoughts? I feel like this should be possible with readaheads, but my regex fu is too weak.
This will match any string up to and including the first non-quoted ",". Is that what you are wanting?
/^([^"]|"[^"]*")*?(,)/
If you want all of them (and as a counter-example to the guy who said it wasn't possible) you could write:
/(,)(?=(?:[^"]|"[^"]*")*$)/
which will match all of them. Thus
'test, a "comma,", bob, ",sam,",here'.gsub(/(,)(?=(?:[^"]|"[^"]*")*$)/,';')
replaces all the commas not inside quotes with semicolons, and produces:
'test; a "comma,"; bob; ",sam,";here'
If you need it to work across line breaks just add the m (multiline) flag.
The below regexes would match all the comma's which are present outside the double quotes,
,(?=(?:[^"]*"[^"]*")*[^"]*$)
DEMO
OR(PCRE only)
"[^"]*"(*SKIP)(*F)|,
"[^"]*" matches all the double quoted block. That is, in this buz,"bar,foo" input, this regex would match "bar,foo" only. Now the following (*SKIP)(*F) makes the match to fail. Then it moves on to the pattern which was next to | symbol and tries to match characters from the remaining string. That is, in our output , next to pattern | will match only the comma which was just after to buz . Note that this won't match the comma which was present inside double quotes, because we already make the double quoted part to skip.
DEMO
The below regex would match all the comma's which are present inside the double quotes,
,(?!(?:[^"]*"[^"]*")*[^"]*$)
DEMO
While it's possible to hack it with a regex (and I enjoy abusing regexes as much as the next guy), you'll get in trouble sooner or later trying to handle substrings without a more advanced parser. Possible ways to get in trouble include mixed quotes, and escaped quotes.
This function will split a string on commas, but not those commas that are within a single- or double-quoted string. It can be easily extended with additional characters to use as quotes (though character pairs like « » would need a few more lines of code) and will even tell you if you forgot to close a quote in your data:
function splitNotStrings(str){
var parse=[], inString=false, escape=0, end=0
for(var i=0, c; c=str[i]; i++){ // looping over the characters in str
if(c==='\\'){ escape^=1; continue} // 1 when odd number of consecutive \
if(c===','){
if(!inString){
parse.push(str.slice(end, i))
end=i+1
}
}
else if(splitNotStrings.quotes.indexOf(c)>-1 && !escape){
if(c===inString) inString=false
else if(!inString) inString=c
}
escape=0
}
// now we finished parsing, strings should be closed
if(inString) throw SyntaxError('expected matching '+inString)
if(end<i) parse.push(str.slice(end, i))
return parse
}
splitNotStrings.quotes="'\"" // add other (symmetrical) quotes here
Try this regular expression:
(?:"(?:[^\\"]+|\\(?:\\\\)*[\\"])*"|'(?:[^\\']+|\\(?:\\\\)*[\\'])*')\s*=>\s*(?:"(?:[^\\"]+|\\(?:\\\\)*[\\"])*"|'(?:[^\\']+|\\(?:\\\\)*[\\'])*')\s*,
This does also allow strings like “'foo\'bar' => 'bar\\',”.
MarkusQ's answer worked great for me for about a year, until it didn't. I just got a stack overflow error on a line with about 120 commas and 3682 characters total. In Java, like this:
String[] cells = line.split("[\t,](?=(?:[^\"]|\"[^\"]*\")*$)", -1);
Here's my extremely inelegant replacement that doesn't stack overflow:
private String[] extractCellsFromLine(String line) {
List<String> cellList = new ArrayList<String>();
while (true) {
String[] firstCellAndRest;
if (line.startsWith("\"")) {
firstCellAndRest = line.split("([\t,])(?=(?:[^\"]|\"[^\"]*\")*$)", 2);
}
else {
firstCellAndRest = line.split("[\t,]", 2);
}
cellList.add(firstCellAndRest[0]);
if (firstCellAndRest.length == 1) {
break;
}
line = firstCellAndRest[1];
}
return cellList.toArray(new String[cellList.size()]);
}
#SocialCensus, The example you gave in the comment to MarkusQ, where you throw in ' alongside the ", doesn't work with the example MarkusQ gave right above that if we change sam to sam's: (test, a "comma,", bob, ",sam's,",here) has no match against (,)(?=(?:[^"']|["|'][^"']")$). In fact, the problem itself, "I don't really care about single vs double quotes", is ambiguous. You have to be clear what you mean by quoting either with " or with '. For example, is nesting allowed or not? If so, to how many levels? If only 1 nested level, what happens to a comma outside the inner nested quotation but inside the outer nesting quotation? You should also consider that single quotes happen by themselves as apostrophes (ie, like the counter-example I gave earlier with sam's). Finally, the regex you made doesn't really treat single quotes on par with double quotes since it assumes the last type of quotation mark is necessarily a double quote -- and replacing that last double quote with ['|"] also has a problem if the text doesn't come with correct quoting (or if apostrophes are used), though, I suppose we probably could assume all quotes are correctly delineated.
MarkusQ's regexp answers the question: find all commas that have an even number of double quotes after it (ie, are outside double quotes) and disregard all commas that have an odd number of double quotes after it (ie, are inside double quotes). This is generally the same solution as what you probably want, but let's look at a few anomalies. First, if someone leaves off a quotation mark at the end, then this regexp finds all the wrong commas rather than finding the desired ones or failing to match any. Of course, if a double quote is missing, all bets are off since it might not be clear if the missing one belongs at the end or instead belongs at the beginning; however, there is a case that is legitimate and where the regex could conceivably fail (this is the second "anomaly"). If you adjust the regexp to go across text lines, then you should be aware that quoting multiple consecutive paragraphs requires that you place a single double quote at the beginning of each paragraph and leave out the quote at the end of each paragraph except for at the end of the very last paragraph. This means that over the space of those paragraphs, the regex will fail in some places and succeed in others.
Examples and brief discussions of paragraph quoting and of nested quoting can be found here http://en.wikipedia.org/wiki/Quotation_mark .
I would like to find a regex that will pick out all commas that fall outside quote sets.
For example:
'foo' => 'bar',
'foofoo' => 'bar,bar'
This would pick out the single comma on line 1, after 'bar',
I don't really care about single vs double quotes.
Has anyone got any thoughts? I feel like this should be possible with readaheads, but my regex fu is too weak.
This will match any string up to and including the first non-quoted ",". Is that what you are wanting?
/^([^"]|"[^"]*")*?(,)/
If you want all of them (and as a counter-example to the guy who said it wasn't possible) you could write:
/(,)(?=(?:[^"]|"[^"]*")*$)/
which will match all of them. Thus
'test, a "comma,", bob, ",sam,",here'.gsub(/(,)(?=(?:[^"]|"[^"]*")*$)/,';')
replaces all the commas not inside quotes with semicolons, and produces:
'test; a "comma,"; bob; ",sam,";here'
If you need it to work across line breaks just add the m (multiline) flag.
The below regexes would match all the comma's which are present outside the double quotes,
,(?=(?:[^"]*"[^"]*")*[^"]*$)
DEMO
OR(PCRE only)
"[^"]*"(*SKIP)(*F)|,
"[^"]*" matches all the double quoted block. That is, in this buz,"bar,foo" input, this regex would match "bar,foo" only. Now the following (*SKIP)(*F) makes the match to fail. Then it moves on to the pattern which was next to | symbol and tries to match characters from the remaining string. That is, in our output , next to pattern | will match only the comma which was just after to buz . Note that this won't match the comma which was present inside double quotes, because we already make the double quoted part to skip.
DEMO
The below regex would match all the comma's which are present inside the double quotes,
,(?!(?:[^"]*"[^"]*")*[^"]*$)
DEMO
While it's possible to hack it with a regex (and I enjoy abusing regexes as much as the next guy), you'll get in trouble sooner or later trying to handle substrings without a more advanced parser. Possible ways to get in trouble include mixed quotes, and escaped quotes.
This function will split a string on commas, but not those commas that are within a single- or double-quoted string. It can be easily extended with additional characters to use as quotes (though character pairs like « » would need a few more lines of code) and will even tell you if you forgot to close a quote in your data:
function splitNotStrings(str){
var parse=[], inString=false, escape=0, end=0
for(var i=0, c; c=str[i]; i++){ // looping over the characters in str
if(c==='\\'){ escape^=1; continue} // 1 when odd number of consecutive \
if(c===','){
if(!inString){
parse.push(str.slice(end, i))
end=i+1
}
}
else if(splitNotStrings.quotes.indexOf(c)>-1 && !escape){
if(c===inString) inString=false
else if(!inString) inString=c
}
escape=0
}
// now we finished parsing, strings should be closed
if(inString) throw SyntaxError('expected matching '+inString)
if(end<i) parse.push(str.slice(end, i))
return parse
}
splitNotStrings.quotes="'\"" // add other (symmetrical) quotes here
Try this regular expression:
(?:"(?:[^\\"]+|\\(?:\\\\)*[\\"])*"|'(?:[^\\']+|\\(?:\\\\)*[\\'])*')\s*=>\s*(?:"(?:[^\\"]+|\\(?:\\\\)*[\\"])*"|'(?:[^\\']+|\\(?:\\\\)*[\\'])*')\s*,
This does also allow strings like “'foo\'bar' => 'bar\\',”.
MarkusQ's answer worked great for me for about a year, until it didn't. I just got a stack overflow error on a line with about 120 commas and 3682 characters total. In Java, like this:
String[] cells = line.split("[\t,](?=(?:[^\"]|\"[^\"]*\")*$)", -1);
Here's my extremely inelegant replacement that doesn't stack overflow:
private String[] extractCellsFromLine(String line) {
List<String> cellList = new ArrayList<String>();
while (true) {
String[] firstCellAndRest;
if (line.startsWith("\"")) {
firstCellAndRest = line.split("([\t,])(?=(?:[^\"]|\"[^\"]*\")*$)", 2);
}
else {
firstCellAndRest = line.split("[\t,]", 2);
}
cellList.add(firstCellAndRest[0]);
if (firstCellAndRest.length == 1) {
break;
}
line = firstCellAndRest[1];
}
return cellList.toArray(new String[cellList.size()]);
}
#SocialCensus, The example you gave in the comment to MarkusQ, where you throw in ' alongside the ", doesn't work with the example MarkusQ gave right above that if we change sam to sam's: (test, a "comma,", bob, ",sam's,",here) has no match against (,)(?=(?:[^"']|["|'][^"']")$). In fact, the problem itself, "I don't really care about single vs double quotes", is ambiguous. You have to be clear what you mean by quoting either with " or with '. For example, is nesting allowed or not? If so, to how many levels? If only 1 nested level, what happens to a comma outside the inner nested quotation but inside the outer nesting quotation? You should also consider that single quotes happen by themselves as apostrophes (ie, like the counter-example I gave earlier with sam's). Finally, the regex you made doesn't really treat single quotes on par with double quotes since it assumes the last type of quotation mark is necessarily a double quote -- and replacing that last double quote with ['|"] also has a problem if the text doesn't come with correct quoting (or if apostrophes are used), though, I suppose we probably could assume all quotes are correctly delineated.
MarkusQ's regexp answers the question: find all commas that have an even number of double quotes after it (ie, are outside double quotes) and disregard all commas that have an odd number of double quotes after it (ie, are inside double quotes). This is generally the same solution as what you probably want, but let's look at a few anomalies. First, if someone leaves off a quotation mark at the end, then this regexp finds all the wrong commas rather than finding the desired ones or failing to match any. Of course, if a double quote is missing, all bets are off since it might not be clear if the missing one belongs at the end or instead belongs at the beginning; however, there is a case that is legitimate and where the regex could conceivably fail (this is the second "anomaly"). If you adjust the regexp to go across text lines, then you should be aware that quoting multiple consecutive paragraphs requires that you place a single double quote at the beginning of each paragraph and leave out the quote at the end of each paragraph except for at the end of the very last paragraph. This means that over the space of those paragraphs, the regex will fail in some places and succeed in others.
Examples and brief discussions of paragraph quoting and of nested quoting can be found here http://en.wikipedia.org/wiki/Quotation_mark .
I am having a string as below:
str1='"{\"#Network\":{\"command\":\"Connect\",\"data\":
{\"Id\":\"xx:xx:xx:xx:xx:xx\",\"Name\":\"somename\",\"Pwd\":\"123456789\"}}}\0"'
I wanted to extract the somename string from the above string. Values of xx:xx:xx:xx:xx:xx, somename and 123456789 can change but the syntax will remain same as above.
I saw similar posts on this site but don't know how to use regex in the above case.
Any ideas how to extract the above string.
Parse the string to JSON and get the values that way.
require 'json'
str = "{\"#Network\":{\"command\":\"Connect\",\"data\":{\"Id\":\"xx:xx:xx:xx:xx:xx\",\"Name\":\"somename\",\"Pwd\":\"123456789\"}}}\0"
json = JSON.parse(str.strip)
name = json["#Network"]["data"]["Name"]
pwd = json["#Network"]["data"]["Pwd"]
Since you don't know regex, let's leave them out for now and try manual parsing which is a bit easier to understand.
Your original input, without the outer apostrophes and name of variable is:
"{\"#Network\":{\"command\":\"Connect\",\"data\":{\"Id\":\"xx:xx:xx:xx:xx:xx\",\"Name\":\"somename\",\"Pwd\":\"123456789\"}}}\0"
You say that you need to get the 'somename' value and that the 'grammar will not change'. Cool!.
First, look at what delimits that value: it has quotes, then there's a colon to the left and comma to the right. However, looking at other parts, such layout is also used near the command and near the pwd. So, colon-quote-data-quote-comma is not enough. Looking further to the sides, there's a \"Name\". It never occurs anywhere in the input data except this place. This is just great! That means, that we can quickly find the whereabouts of the data just by searching for the \"Name\" text:
inputdata = .....
estposition = inputdata.index('\"Name\"')
raise "well-known marker wa not found in the input" unless estposition
now, we know:
where the part starts
and that after the "Name" text there's always a colon, a quote, and then the-interesting-data
and that there's always a quote after the interesting-data
let's find all of them:
colonquote = inputdata.index(':\"', estposition)
datastart = colonquote+3
lastquote = inputdata.index('\"', datastart)
dataend = lastquote-1
The index returns the start position of the match, so it would return the position of : and position of \. Since we want to get the text between them, we must add/subtract a few positions to move past the :\" at begining or move back from \" at end.
Then, fetch the data from between them:
value = inputdata[datastart..dataend]
And that's it.
Now, step back and look at the input data once again. You say that grammar is always the same. The various bits are obviously separated by colons and commas. Let's try using it directly:
parts = inputdata.split(/[:,]/)
=> ["\"{\\\"#Network\\\"",
"{\\\"command\\\"",
"\\\"Connect\\\"",
"\\\"data\\\"",
"\n{\\\"Id\\\"",
"\\\"xx",
"xx",
"xx",
"xx",
"xx",
"xx\\\"",
"\\\"Name\\\"",
"\\\"somename\\\"",
"\\\"Pwd\\\"",
"\\\"123456789\\\"}}}\\0\""]
Please ignore the regex for now. Just assume it says a colon or comma. Now, in parts you will get all the, well, parts, that were detected by cutting the inputdata to pieces at every colon or comma.
If the layout never changes and is always the same, then your interesting-data will be always at place 13th:
almostvalue = parts[12]
=> "\\\"somename\\\""
Now, just strip the spurious characters. Since the grammar is constant, there's 2 chars to be cut from both sides:
value = almostvalue[2..-3]
Ok, another way. Since regex already showed up, let's try with them. We know:
data is prefixed with \"Name\" then colon and slash-quote
data consists of some text without quotes inside (well, at least I guess so)
data ends with a slash-quote
the parts in regex syntax would be, respectively:
\"Name\":\"
[^\"]*
\"
together:
inputdata =~ /\\"Name\\":\\"([^\"]*)\\"/
value = $1
Note that I surrounded the interesting part with (), hence after sucessful match that part is available in the $1 special variable.
Yet another way:
If you look at the grammar carefully, it really resembles a set of embedded hashes:
\"
{ \"#Network\" :
{ \"command\" : \"Connect\",
\"data\" :
{ \"Id\" : \"xx:xx:xx:xx:xx:xx\",
\"Name\" : \"somename\",
\"Pwd\" : \"123456789\"
}
}
}
\0\"
If we'd write something similar as Ruby hashes:
{ "#Network" =>
{ "command" => "Connect",
"data" =>
{ "Id" => "xx:xx:xx:xx:xx:xx",
"Name" => "somename",
"Pwd" => "123456789"
}
}
}
What's the difference? the colon was replaced with =>, and the slashes-before-quotes are gone. Oh, and also opening/closing \" is gone and that \0 at the end is gone too. Let's play:
tmp = inputdata[2..-4] # remove opening \" and closing \0\"
tmp.gsub!('\"', '"') # replace every \" with just "
Now, what about colons.. We cannot just replace : with =>, because it would damage the internal colons of the xx:xx:xx:xx:xx:xx part.. But, look: all the other colons have always a quote before them!
tmp.gsub!('":', '"=>') # replace every quote-colon with quote-arrow
Now our tmp is:
{"#Network"=>{"command"=>"Connect","data"=>{"Id"=>"xx:xx:xx:xx:xx:xx","Name"=>"somename","Pwd"=>"123456789"}}}
formatted a little:
{ "#Network"=>
{ "command"=>"Connect",
"data"=>
{ "Id"=>"xx:xx:xx:xx:xx:xx","Name"=>"somename","Pwd"=>"123456789" }
}
}
So, it looks just like a Ruby hash. Let's try 'destringizing' it:
packeddata = eval(tmp)
value = packeddata['#Network']['data']['Name']
Done.
Well, this has grown a bit and Jonas was obviously faster, so I'll leave the JSON part to him since he wrote it already ;) The data was so similar to Ruby hash because it was obviously formatted as JSON which is a hash-like structure too. Using the proper format-reading tools is usually the best idea, but mind that the JSON library when asked to read the data - will read all of the data and then you can ask them "what was inside at the key xx/yy/zz", just like I showed you with the read-it-as-a-Hash attempt. Sometimes when your program is very short on the deadline, you cannot afford to read-it-all. Then, scanning with regex or scanning manually for "known markers" may (not must) be much faster and thus prefereable. But, still, much less convenient. Have fun.
I have a variable like book_file_name which stores a filename with path like this:
book_file_name
=> "./download/Access\\ Database\\ Design\\ \\&\\ Programming,\\ 3rd\\ Edition.PDF"
puts book_file_name
./download/Access\ Database\ Design\ \&\ Programming,\ 3rd\ Edition.PDF
=> nil
book_file_name.length
=> 71
When I use File.exists? to check the file, something is wrong.
This is how I use the string:
File.exists?("./download/Access\ Database\ Design\ \&\ Programming,\ 3rd\ Edition.PDF")
=> true
This is how I use the variable:
File.exists?(book_file_name)
=> false
What's wrong with the variable?
The string
"./download/Access\ Database\ Design\ \&\ Programming,\ 3rd\ Edition.PDF"
is in double-quotes, which causes the backslash+space to be replaced with space
This won't happen with a string variable like book_file_name, and won't happen in a string enclosed within single quotes.
I can see the actual book name with path is
'./download/Access Database Design & Programming, 3rd Edition.PDF'
so
File.exists?('./download/Access Database Design & Programming, 3rd Edition.PDF')
File.exists?("./download/Access Database Design & Programming, 3rd Edition.PDF")
book_file_name = './download/Access Database Design & Programming, 3rd Edition.PDF'
File.exists?(bookfilename)
book_file_name = "./download/Access Database Design & Programming, 3rd Edition.PDF"
File.exists?(bookfilename)
will all work just fine... so you're better off not using backslashes.
As you have shown in your code snippets, the string contained in your variable has backslashes in it. You don't need to escape the spaces, but if you do, you only need to escape them with one backslash. As it stands, you are using double backslashes; the first backslash escapes the second, and has no impact on the space.
puts "file name with spaces"
# => file name with spaces
puts "file\ name\ with\ spaces"
# => file name with spaces
puts "file\\ name\\ with\\ spaces"
# => file\ name\ with\ spaces
This explains why your string literal succeeds where your variable fails: the two strings are not equivalent. So just store the same string literal that succeeded (the one with single backslashes) or else the string literal without any backslashes and you should be good to go.