Closed. This question is off-topic. It is not currently accepting answers.
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 10 years ago.
I am migrating a shop over for a client.
I have to pull all the old image files off her 'shop' which has no FTP access.
It allowed me to export a list of filenames/URLs. My plan was to load them up in Firefox and use "Downloadthemall" to simply download all the files (around 2000). However, about 1/3 of them have [ and ] in the name.
i.e.
cdn.crapshop.com/images/image[1].jpg
Downloadthemall freaks out and only reads it as
cdn.crapshop.com/images/image
And won't download it because it isn't a file.
Anyone got any ideas of an alternative way to pull a list like this?
See this solution that explains why the example URL you provided is invalid: Validation. As the answer by #good in that post shows, characters that the URL specification does not allow have to be percent-encoded so the web server will understand them.
This calls for Python. See this post: Percent encoding in python
We can then put it all together in a script that reads from stdin and writes to stdout: python script.py < input > output.out.
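import sys
from urllib.parse import quote  # Python 3; in Python 2 this was urllib.quote

# Read one URL per line from stdin and print it percent-encoded.
for line in sys.stdin:
    url = line.strip().strip("'")
    if not url:
        continue  # skip blank lines
    # Keep ':' and '/' literal so the scheme and path separators survive;
    # '[' and ']' become %5B and %5D.
    print(quote(url, safe=':/'))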
Then, hopefully, Downloadthemall will parse the corrected list. (The input to the script is supposed to be a list of URLs, one per line.)
You may be interested in this post as well: Downloading files with python, which shows how to download files (web pages in particular) using Python.
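If the browser extension still refuses the list, the encoding and the download can be done in one go. The following is only a sketch: the function names, the default http scheme (the exported example URLs have none) and the choice to save each file under its original, unencoded name are assumptions, not something taken from the linked posts.

```python
import os
import urllib.request
from urllib.parse import quote, unquote, urlsplit


def encode_url(raw, scheme="http"):
    """Percent-encode a URL, keeping ':' and '/' literal.

    Assumption: entries without a scheme are plain HTTP.
    """
    url = raw.strip()
    if "://" not in url:
        url = scheme + "://" + url
    return quote(url, safe=":/")


def local_name(encoded_url):
    """Name to save under: the last path segment, decoded again."""
    return unquote(os.path.basename(urlsplit(encoded_url).path))


def download_all(lines, dest_dir="."):
    """Fetch every non-blank line of `lines` into dest_dir."""
    for line in lines:
        if not line.strip():
            continue
        url = encode_url(line)
        target = os.path.join(dest_dir, local_name(url))
        urllib.request.urlretrieve(url, target)
```

Calling download_all(open("urls.txt")) would then fetch the whole list, where urls.txt is a hypothetical file holding the exported URLs, one per line. URLs that carry a query string would need '?', '&' and '=' added to the safe set.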
Good luck!
Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 1 year ago.
I'm writing a bash script and I want to edit a PHP config file, find a string and replace by another.
The hard part is that I want this search/replace to be dynamic.
Here is an example:
define('APP_VERSION', '1.0.31');
The goal is to replace 1.0.31 by another version number.
How could I achieve that? I've tried with sed, but can't isolate the version number part (because it's not always the same, so I can't directly search for 1.0.31)
Thanks
The point of regexes is to match non-static text. To replace any version number with 123 use
sed "s/define('APP_VERSION', *'[^']*')/define('APP_VERSION', '123')/"
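For comparison, here is a rough Python equivalent of that substitution with the new version passed in as a parameter. The function name set_version is made up for this sketch; the pattern mirrors the sed expression above.

```python
import re

# Same idea as the sed expression: match define('APP_VERSION', '<anything>')
# and capture the text on both sides of the version string.
PATTERN = re.compile(r"(define\('APP_VERSION', *')[^']*('\))")


def set_version(php_source, new_version):
    """Return php_source with the APP_VERSION value replaced."""
    # A function is used as the replacement so that digits in new_version
    # are never mistaken for group references.
    return PATTERN.sub(
        lambda m: m.group(1) + new_version + m.group(2), php_source
    )
```

Because the version is matched with [^']*, the current value does not need to be known in advance.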
Closed. This question is off-topic. It is not currently accepting answers.
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 9 years ago.
I'm unable to read data from this file. It's a .dat file and I tried opening it using Notepad, but after opening it I'm unable to read it. The words are not in English. I tried changing the font, but it didn't help. I even tried changing the format, but it was still the same. Can anyone help me with this, please?
The file is shared over here:
https://docs.google.com/file/d/0BwISJR5GZQ88a29yTFZKTnJMYVU/edit?usp=sharing
This is not a text file; it is binary. It has the MIME type application/octet-stream.
This means you need to open it in whatever program it was created with.
You won't be able to read it, because it's not text data. Notepad certainly will not help.
.dat files are used by lots of programs, and this file may belong to any one of them. I don't know why this file was shared. Maybe it is a save file from a game or the settings of some other program.
So in brief, you can't open it as text, and if you don't have any other information about this .dat file, you should let it go.
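A quick way to check whether a file is text at all is to look at its first bytes. The sketch below is a heuristic only (a NUL byte or invalid UTF-8 is taken as a sign of binary data), and both function names are invented for the example.

```python
def looks_binary(data):
    """Heuristic: NUL bytes or undecodable UTF-8 suggest binary data."""
    if b"\x00" in data:
        return True
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


def hex_preview(data, length=16):
    """Hex dump of the first `length` bytes, e.g. to spot a file signature."""
    return " ".join(f"{b:02x}" for b in data[:length])
```

For a real file, read it in binary mode first, for instance with open("data.dat", "rb").read(), where data.dat is a placeholder name. Passing only part of a file can cut a multi-byte character in half and give a false positive.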
Closed. This question is off-topic. It is not currently accepting answers.
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 10 years ago.
I need to navigate to a certain directory and then execute a script located there.
I am using cd folder_name to navigate to the next directory.
One folder has a very long name (with white spaces). Is there a way to type only the first few letters and then use a shortcut key to autocomplete with the first matching name, or to navigate through possible matches?
The same if I want to perform a command on a certain file (e.g. chmod XXX file_name), is there a way to get the name to appear after I type a few letters of the filename?
The shell I am using is bash-3.2 in OS X 10.7.4.
Yes, Bash supports auto-completion (personally, it's one of my favorite features). Type the first few letters and press the Tab key to complete the name (note that it's case-sensitive). If more than one name matches, pressing Tab twice lists the candidates. The Advanced Bash-Scripting Guide has an introduction to programmable completion. You can enable completion for command names and more!
Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 8 years ago.
I'm given a task of converting a bunch of code written in Python 2 to Python 3,
and this task was given with emphasis on having UTF-8 (didn't quite comprehend the concept but anyway..)
I've automated the conversion using 2to3, but not sure if using 2to3 achieves the goal of having UTF-8, or if there's some other parts that I should manually work on.
What is it exactly, and is it done automatically by using 2to3?
Thank you in advance.
"I was just told the importance of converting it into Python 3 due to importance of UTF-8 so that the program can work with any other language"
Whoever told you that was misinformed.
2to3 does not do anything towards "having UTF-8", whatever that means. 2to3 is for moving your code from Python 2 to Python 3. Python 3 does mean you can have Unicode variable names, but I would strongly recommend against that anyway. Bad idea. Otherwise, Python 2 supports Unicode and UTF-8 perfectly well.
It seems your actual goal is not UTF-8, but translating the program into other languages, also known as internationalization, or "i18n". That's a completely different issue, and has nothing to do with 2to3. Instead you need to manually change all your text strings to gettext tokens that will be translated when rendered. See http://docs.python.org/library/gettext.html
See also http://regebro.wordpress.com/2011/03/23/unconfusing-unicode-what-is-unicode/ for more information on Unicode.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 8 years ago.
I'm writing a scraper program. I collect all the links on a page. They might be relative paths. For example:
foo.html
/foo.html
../foo.html
../../foo.html
I can concat them to the url of the page (basepath) they are on, but that isn't completely straightforward. For example:
http://www.example.com/foo + /bar.html = http://www.example.com/bar.html
http://www.example.com/bla/?foo=bar + ../foo.html = http://www.example.com/foo.html
I am wondering if there is an Erlang Lib, C Lib or a CLI program that can figure out the right concatenation for me?
As far as CLI goes, wget has the --base switch:
-B URL
--base=URL
Resolves relative links using URL as the point of reference, when reading links from an HTML file specified via the -i/--input-file option (together with --force-html, or when the input file was fetched remotely from a server describing it as HTML). This is equivalent to the presence of a "BASE" tag in the HTML input file, with URL as the value for the "href" attribute.
For instance, if you specify http://foo/bar/a.html for URL, and Wget reads ../baz/b.html from the input file, it would be resolved to http://foo/baz/b.html.
So if you exec'd it to output the file to stdout and read that from your Erlang script, that should work.
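The resolution rule itself is easy to check against the examples in the question. This is not one of the Erlang, C or CLI options asked for, just a small illustration using urljoin from Python's standard library; the resolve_all helper is made up for the sketch.

```python
from urllib.parse import urljoin


def resolve_all(base, links):
    """Resolve each (possibly relative) link against the page URL `base`."""
    return [urljoin(base, link) for link in links]


# The two examples from the question and the one from the wget manual.
print(urljoin("http://www.example.com/foo", "/bar.html"))
print(urljoin("http://www.example.com/bla/?foo=bar", "../foo.html"))
print(urljoin("http://foo/bar/a.html", "../baz/b.html"))
```

The three calls print http://www.example.com/bar.html, http://www.example.com/foo.html and http://foo/baz/b.html respectively, matching the results expected in the question and in the wget manual.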
You can use ex_uri:resolve/2.