Is there an API in Windows that retrieves the server name from a UNC path (\\server\share)?
Or do I need to write my own?
I found PathStripToRoot, but it doesn't do the trick.
I don't know of a Win32 API for parsing a UNC path; however you should check for:
\\computername\share
\\?\UNC\computername\share (people use this to access long paths > 260 chars)
You can optionally also handle this case: smb://computername/share and this case hostname:/directorypath/resource
Read here for more information
This is untested, but maybe a combination of PathIsUNC() and PathFindNextComponent() would do the trick.
I don't know if there is a specific API for this, I would just implement the simple string handling on my own (skip past "\\" or return null, look for next \ or end of string and return that substring) possibly calling PathIsUNC() first
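The simple string handling described above can be sketched roughly as follows. This is an untested-on-Windows illustration in Ruby rather than Win32 code; the method name unc_server and both constants are made up for the example, and it only covers the \\server\share and \\?\UNC\server\share forms mentioned earlier.

```ruby
# Hypothetical helper, not a Windows API: pull the server name out of a UNC path.
# Single-quoted Ruby strings: '\\\\' is two backslashes, '\\' is one.
LONG_UNC_PREFIX = '\\\\?\\UNC\\'   # \\?\UNC\   (compared case-sensitively here)
UNC_PREFIX      = '\\\\'           # \\

def unc_server(path)
  rest =
    if path.start_with?(LONG_UNC_PREFIX)
      path[LONG_UNC_PREFIX.length..-1]
    elsif path.start_with?(UNC_PREFIX)
      path[UNC_PREFIX.length..-1]
    end
  return nil if rest.nil?                      # not a UNC path at all

  # The server name runs up to the next backslash or the end of the string.
  server = rest.split('\\', 2).first
  # "?" and "." are the \\?\ and \\.\ device prefixes, not server names.
  return nil if server.nil? || server.empty? || server == '?' || server == '.'

  server
end
```

Calling unc_server on a path without a leading \\ returns nil, which mirrors the "return null" step in the description.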
If you'll be receiving the data as plain text, you should be able to parse it with a simple regex. I'm not sure what language you use, but I tend to use Perl for quick searches like this. Supposing you have a large document with one path per line, you can search on the \\'s, i.e.
m/\\\\([0-9][0-9][0-9]\. (repeat 3 times; I don't recall the IP address requirements offhand, so you will probably need to modify the first one), then \\)? to make it optional and include the trailing slash, and finally (.*)\\/ig. It's rough but should do the trick, and the path name should be in $2 for use!
I hope that was clear enough!
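For what it's worth, here is a rough Ruby take on the same "one path per line" idea. It is not the exact pattern described above: instead of matching an IP address it captures whatever sits between the leading \\ and the next backslash, and the names UNC_LINE and unc_servers are invented for this sketch.

```ruby
# Hypothetical pattern: \\ at the start of a line, an optional ?\UNC\ prefix,
# then the server name (no backslash, whitespace or "?"), a backslash, and the rest.
UNC_LINE = /^\\\\(?:\?\\UNC\\)?([^\\\s?]+)\\(.*)$/i

# Returns the server name of every line that looks like a UNC path.
def unc_servers(text)
  text.scan(UNC_LINE).map { |server, _rest| server }
end
```

Lines that are not UNC paths are simply skipped, so the method can be run over a whole document in one go.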
My regexp behaves just like I want it to on http://regexr.com, but not like I want it in irb.
I'm trying to make a regular expression that will match the following:
A forward slash,
then 2 * any number of random characters (i.e. `.*`),
up to but not including another /
OR the end of the string (whichever comes first)
I'm sorry as that was probably unclear, but it's my best attempt at an English translation.
Here's my current attempt and hopefully that will give you a better idea of what I'm trying to do:
/(\/.*?(?=\/|$)){2}/
The usage scenario is I want to be able to take a path like /foo/bar/baz/bin/bash and shorten it to the level I'm at in the filesystem, in this case the second level (/foo/bar). I'm trying to do this using the command path.scan(-regex-).shift.
The usage scenario is I want to be able to take a path like /foo/bar/baz/bin/bash and shorten it to the level I'm at in the filesystem, in this case the second level (/foo/bar)
Ruby already has a class for handling paths, Pathname. You can use Pathname#relative_path_from to do what you want.
require 'pathname'
path = Pathname.new("/foo/bar/baz/bin/bash")
# Normally you'd use Pathname.getwd
cwd = Pathname.new("/foo/bar")
# baz/bin/bash
puts path.relative_path_from(cwd)
Regexes just invite problems, like assuming the path separator is /, not honoring escapes, and not dealing with extra /. For example, "//foo/bar//b\\/az/bin/bash". // is particularly common in code which joins together directories using paths.join("/") or "#{dir}/#{file}".
For completeness, the general way you match a single piece of a path is this.
%r{^(/[^/]+)}
That's the beginning of the string, a /, then 1 or more characters which are not /. Using [^/]+ means you don't have to try and match an optional / or end of string, a very useful technique. Using %r{} means less leaning toothpicks.
But this is only applicable to a canonicalized path. It will fail on //foo//b\\/ar/. You can try to fix up the regex to deal with that, or do your own canonicalization, but just use Pathname.
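As a sketch of the "just use Pathname" route for the original goal (keeping only the first two levels), something like the following should work for absolute POSIX-style paths. The method name first_levels is hypothetical, and it leans on Pathname#cleanpath to collapse repeated slashes; it makes no attempt to handle backslash-escaped separators.

```ruby
require 'pathname'

# Hypothetical helper: keep only the first `depth` components of an absolute path.
def first_levels(path, depth)
  # cleanpath normalizes the path lexically (repeated "/", "." and "..");
  # each_filename then walks the individual components.
  names = Pathname.new(path).cleanpath.each_filename.first(depth)
  File.join('/', *names)
end

puts first_levels('/foo/bar/baz/bin/bash', 2)
```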
I'm new to regexes and to Sublime, and am having issues trying to do a find/replace on all email addresses in a csv file.
I thought it would be reasonably straightforward but seem to be heading down the rabbit hole at a great rate of knots.
Data looks like;
data,data,email@address.com,data,data etc. NB: there are about 100 fields per record and about 300 records
My thought was to look for the @ symbol, then go left and right until I get to a comma, and then replace with my new email address, but I just can't get a win.
Any thoughts or am I using the wrong tool for the job?
(Also tagging with Ruby, as if I need to do some scripting then I'll try to figure it out in Ruby)
Thanks,
Liam
user2141046's expression won't find an email address like "a.b@c.com".
I would suggest using:
[a-zA-Z0-9.!#$%&'+-/=?\^_`{|}~-]+@[a-zA-Z0-9-]+(?:.[a-zA-Z0-9-]+)
Source
I'm not familiar with the Ruby language, but a regex that finds what you want is:
\w+\@\w+\.\w+
with the \. maybe unneeded (depending on language).
A Perl one-liner that does the exact thing:
perl -pi -e 's/\w+\@\w+\.\w+/<your new email here>/g' <csv file here>
Note:
Make sure you use \@ in the new email in the one-liner I wrote, meaning new_email\@server.com
Try this:
[a-zA-Z0-9.!#$%&'*+-/=?\^_`{|}~-]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*
It worked perfectly on a very long csv file filled with emails and all other kinds of stuff.
[a-zA-Z0-9.!#$%&'+-/=?\^_`{|}~-]+@[a-zA-Z0-9-]+(?:.[a-zA-Z0-9-]+)
will not work, because some domains have 2 or more levels (like com.br).
Use:
[a-zA-Z0-9.!#$%&'+-/=?\^_`{|}~-]+@[a-zA-Z0-9-]+(?:.[\.a-zA-Z0-9-]+)
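Since the question was also tagged Ruby, here is a minimal sketch of the asker's own idea: find the separator character of an address (assumed here to be @), extend left and right until a comma, and swap in the new address. The constant EMAIL_FIELD and the method replace_emails are names invented for this example, and the pattern is deliberately loose rather than a full email validator.

```ruby
# Hypothetical, deliberately loose pattern: a run of characters around a single "@",
# stopping at commas, double quotes and whitespace, with at least one dot after the "@".
EMAIL_FIELD = /[^,"\s@]+@[^,"\s@]+\.[^,"\s@]+/

# Replace every email-looking field in the CSV text with new_email.
def replace_emails(csv_text, new_email)
  # Block form so that backslashes in new_email are never treated as backreferences.
  csv_text.gsub(EMAIL_FIELD) { new_email }
end

puts replace_emails("data,data,a.b@c.com,data,data\n", 'new@example.com')
```

For a real file this could be wrapped as File.write(path, replace_emails(File.read(path), new_email)), ideally on a copy first. Unlike the Perl one-liner, no escaping of the @ is needed in a single-quoted Ruby string.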
I am trying to use Windows API functions compatible with Windows XP and up to find the target of a junction or symbolic link. I am using CreateFile to get a handle to the reparse point, then DeviceIoControl with the FSCTL_GET_REPARSE_POINT control code to read the reparse data into a REPARSE_DATA_BUFFER. Then, I use the offsets and lengths in the buffer to extract the SubstituteName and PrintName strings.
In Windows 8, extracting the PrintName works perfectly, giving me a normal path (i.e. c:\filename.ext), but in XP the PrintName section of the REPARSE_DATA_BUFFER seems to always have a length of 0, leaving me with an empty string.
Using the SubstituteName seems to work in both, but I always end up with a prefix of \??\ at the beginning of the file path (i.e. \??\c:\filename.ext). (As a side note, fsutil reparsepoint query shows the \??\ prefix as well.)
I've read through much of the documentation on MSDN, but I can't find any explanation of this prefix. If the prefix is guaranteed to begin every SubstituteName, then I can just exclude the first four characters when I copy the file path from the buffer, but I'm not sure that this is the case. I would love to know if the "\??\" prefix appears in the SubstituteName for all Microsoft reparse points and why.
The Windows kernel has a "DOS Devices namespace" \DosDevices\ which is basically where anything you can open with CreateFile resides. (QueryDosDevice is a function which gives you all the members of that namespace.)
Because it's such a commonly used path, \??\ also redirects to that namespace. So, to the kernel, the path C:\Windows is invalid -- it should really be written as something like \??\C:\Windows. That's where this notation comes from.
The \??\ prefix means the path is not parsed. It is not guaranteed on every name, so you will have to look for the prefix on a per-name basis and skip it if present.
Update: I could not find any definitive documentation explaining exactly what \??\ actually represents, but here are some links that mention the \??\ prefix in action:
http://www.flexhex.com/docs/articles/hard-links.phtml
Note that szTarget string must contain the path prefixed with the "non-parsed" prefix "\??\", and terminated with the backslash character, for example "\??\C:\Some Dir\".
http://social.msdn.microsoft.com/Forums/en-US/vbgeneral/thread/908b3927-1ee9-4e03-9922-b4fd49fc51a6
http://mjunction.googlecode.com/svn-history/r5/trunk/MJunction/MJunction/JunctionPoint.cs
This prefix indicates to NTFS that the path is to be treated as a non-interpreted path in the virtual file system.
Private Const NonInterpretedPathPrefix As String = "\??\"
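The per-name check suggested in this answer (skip the \??\ prefix only when it is actually there) is a one-liner in most languages. A small Ruby illustration follows, with NT_PREFIX and strip_nt_prefix as made-up names; in C the equivalent would be a comparison of the first four characters of the SubstituteName before copying it.

```ruby
# Single-quoted Ruby string: this is the four characters \??\
NT_PREFIX = '\??\\'

# Hypothetical helper: drop the \??\ prefix if present, otherwise return the name unchanged.
def strip_nt_prefix(name)
  name.delete_prefix(NT_PREFIX)
end

puts strip_nt_prefix('\??\c:\filename.ext')
```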
I have a filename declared like this;
filename = Time.now.strftime("%H:%M:%S")+'.json'
and the error occurs when I do this
File.open(filename,'w') do |f|
f.write(rsp)
end
The error is: in `initialize': Invalid argument - 18:28:20.json, which I assume is because of the filename. When I use a 'normal' name everything works OK, so any tips?
Try:
filename = Time.now.strftime("%H_%M_%S")+'.json'
Windows uses the colon as a drive letter separator;
see this SO question for other special chars.
Use a different separator. You might be able to escape it, but IMO, not really worth it.
FWIW, for timestamped filenames I tend towards yyyymmdd-hhmmss or similar anyway.
For things like files it's always good to include more-complete info in the question--that naming conventions are different across OSes is well-known.
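A quick sketch of the yyyymmdd-hhmmss style mentioned above, which avoids the colon entirely. The method name timestamped_filename and its default arguments are just for illustration.

```ruby
# Hypothetical helper: build a colon-free, sortable filename from a Time.
def timestamped_filename(time = Time.now, ext = '.json')
  time.strftime('%Y%m%d-%H%M%S') + ext
end

filename = timestamped_filename
puts filename
```

A nice side effect of this format is that the files sort chronologically by name.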
I am trying to construct a regex to extract a domain given a url.
for:
http://www.abc.google.com/
http://abc.google.com/
https://www.abc.google.com/
http://abc.google.com/
should give:
abc.google.com
require 'uri'
URI.parse('http://www.abc.google.com/').host
#=> "www.abc.google.com"
Not a regex, but probably more robust than anything we come up with here.
URI.parse('http://www.abc.google.com/').host.gsub(/^www\./, '')
If you want to remove the www. as well this will work without raising any errors if the www. is not there.
Don't know much about Ruby, but this regex pattern gives you the last 3 parts of the url, excluding the trailing slash, with a minimum of 2 characters per part.
([\w-]{2,}\.[\w-]{2,}\.[\w-]{2,})/$
You may be able to use the domain_name gem for this kind of work. From the README:
require "domain_name"
host = DomainName("a.b.example.co.uk")
host.domain #=> "example.co.uk"
Your question is a little bit vague. Can you give a precise specification of what it is exactly that you want to do? (Preferably with a test suite.) Right now, all your question says is that you want a method that always returns 'abc.google.com'. That's easy:
def extract_domain
return 'abc.google.com'
end
But that's probably not what you meant …
Also, you say that you need a Regexp. Why? What's wrong with, for example, using the URI class? After all, parsing and manipulating URIs is exactly what it was made for!
require 'uri'
URI.parse('https://abc.google.com/').host # => 'abc.google.com'
And lastly, you say you are "trying to extract a domain", but you never specify what you mean by "domain". It looks like you sometimes mean the FQDN and sometimes randomly drop parts of the FQDN, but according to what rules? For example, for the FQDN abc.google.com, the domain name is google.com and the host name is abc, but you want it to return abc.google.com, which is not just the domain name but the full FQDN. Why?
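To make the FQDN / host name / domain name distinction concrete, here is a deliberately naive sketch built on the URI class. The method host_parts is hypothetical, and treating "the last two labels" as the domain is an assumption that breaks for suffixes like co.uk (which is the case the domain_name gem mentioned earlier is meant for).

```ruby
require 'uri'

# Hypothetical helper: split a URL's FQDN into a host part and a (naive) domain part.
# Assumption: the registrable domain is simply the last two dot-separated labels.
def host_parts(url)
  fqdn = URI.parse(url).host
  return nil if fqdn.nil?          # e.g. no scheme, so URI finds no host

  labels = fqdn.split('.')
  {
    fqdn:   fqdn,
    host:   labels[0...-2].join('.'),
    domain: labels.last(2).join('.')
  }
end

p host_parts('https://abc.google.com/')
```

Whichever of these three values the question is really after, writing the expected outputs down as a small table of URL to result pairs would settle the specification.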