Download whole folder from SourceForge

I need to download a project from SourceForge, but I don't see an easy way to do it.
On the screenshot linked below (I don't have enough reputation to embed it), it is possible to download "the latest version", but that includes only the files from the first folder; I need to download another folder as well.
It is possible to download the files one by one, but with hundreds of files and subfolders that would be quite impractical.
Does anyone know a way to download the whole folder? I didn't find much; some posts mentioned wget, but I tried it without success.
Link: http://s9.postimg.org/xk2upvbwv/example.jpg

Every SourceForge project page and project folder page has an RSS link, as shown in the example screenshot.
Right-click the RSS icon on the page of the folder or project you want to download, copy the link, and use it in the following Bash one-liner:
curl "<URL>" | grep "<link>.*</link>" | sed 's|<link>||;s|</link>||' | while read url; do url=`echo $url | sed 's|/download$||'`; wget $url ; done
replace "<URL>" with your RSS link for example : "https://sourceforge.net/projects/xdxf/rss?path=/dicts-babylon/001", and watch the magic happens, The RSS link will include all the files of the Sourceforge folder or project and it's sub-folders, so the script will download everything recursively.
If the above script doesn't work, try this one, which extracts the links from the HTML directly. Replace "<URL>" with the project's files URL, for example "https://sourceforge.net/projects/synwrite-addons/files/Lexers/":
curl "<URL>" | tr '"' "\n" | grep "sourceforge.net/projects/.*/download" | sort | uniq | while read url; do url=`echo $url | sed 's|/download$||'`; wget $url ; done
Good luck

Sometimes there is a download link on the Summary tab, but when there isn't, I don't know of a workaround, so I use this piece of code:
var urls = document.getElementsByClassName('name');
var txt = "";
for (var i = 0; i < urls.length; i++) {
    txt += "wget " + urls[i].href + "\n";
}
alert(txt);
Open a console in your browser on the page where all the files are listed. Copy-paste the code and press Enter, and you will be shown a list of wget commands that you can copy-paste into your terminal and run.

How to download all files from a particular folder in Sourceforge on Windows 10:
Step 1: Download the latest wget (zip file, not exe file) from https://eternallybored.org/misc/wget/.
Note: Search Google for 'wget 1.20.x' to find the proper link, if necessary. Download the 32-bit file if your system is Windows 10 32-bit, or the 64-bit file if your system is Windows 10 64-bit.
Step 2: Download the latest grep and coreutils installers from http://gnuwin32.sourceforge.net/packages.html.
Note: Search Google for 'gnuwin32' to find the proper link, if necessary. Only 32-bit installers are available.
Step 3: Extract everything in the downloaded wget zip file to C:\WgetWin32 or C:\WgetWin64.
Note: You can install wget virtually anywhere, but preferably in a folder without spaces in its name.
Step 4: Install grep by double-clicking its installer and choosing an installation folder, e.g., C:\GnuWin32.
Step 5: Install coreutils by double-clicking its installer and choosing the same folder where grep was installed.
Note: You can install grep and coreutils in any order (first grep and then coreutils, or vice versa) and virtually anywhere, even in the default location suggested by the installer, but preferably in a folder without spaces in its name.
Step 6: Right-click the 'This PC' icon on the desktop. Select 'Properties' from the context menu. Select 'Advanced system settings' in the 'System' window that pops up.
Select 'Environment Variables...' in the 'System Properties' window. Select 'Path' under 'System variables' and click the 'Edit...' button.
Click the 'New' button in the 'Edit environment variable' pop-up window. Enter the path to the wget installation folder (e.g., 'C:\WgetWin32' or 'C:\WgetWin64', without quotes).
Click the 'New' button again. Enter the path to the grep and coreutils installation folder (e.g., 'C:\GnuWin32\bin', without quotes).
Now click 'OK' in each of the pop-up windows to save the changes.
Step 7: Using a text editor, create a DOS batch file, 'wgetcmd.bat', in the wget installation folder (e.g., 'C:\WgetWin32' or 'C:\WgetWin64', without quotes) with the following lines.
cd C:\WgetWin32
cmd
(OR)
cd C:\WgetWin64
cmd
Step 8: Create a shortcut to this batch file on the desktop.
Step 9: Right-click on the shortcut and select the 'Run as administrator' from the drop-down list.
Step 10: Enter the following commands, either one by one or all at once, at the DOS prompt in the command window that opens.
wget -w 1 -np -m -A download <link_to_sourceforge_folder>
grep -Rh refresh sourceforge.net | grep -o "https[^\\?]*" > urllist
wget -P <folder_where_you_want_files_to_be_downloaded> -i urllist
That's all folks! This will download all the files from the Sourceforge folder specified.

Example for the above: suppose I want to download all the files from the SourceForge folder https://sourceforge.net/projects/octave/files/Octave%20Forge%20Packages/Individual%20Package%20Releases/; the following commands will do this.
wget -w 1 -np -m -A download https://sourceforge.net/projects/octave/files/Octave%20Forge%20Packages/Individual%20Package%20Releases/
grep -Rh refresh sourceforge.net | grep -o "https[^\\?]*" > urllist
wget -P OctaveForgePackages -i urllist
The SourceForge folder mentioned above contains a lot of Octave packages as .tar.gz files. All of those files will be downloaded into the local folder 'OctaveForgePackages'.
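For reference, here is the same three-command sequence annotated with shell-style comments explaining what each flag does (at the DOS prompt you type only the commands themselves, without the # comments):
# -w 1: wait 1 second between requests; -np: never ascend to the parent directory;
# -m: mirror recursively; -A download: keep only the fetched pages named "download"
# (the redirect pages SourceForge serves for each file)
wget -w 1 -np -m -A download <link_to_sourceforge_folder>
# the mirrored pages land in a local sourceforge.net folder; pull the real https
# URLs out of their meta-refresh redirects into a list (-R: recurse, -h: no
# filename prefix, -o: print only the matching part)
grep -Rh refresh sourceforge.net | grep -o "https[^\\?]*" > urllist
# download every URL in the list into the folder given with -P
wget -P <folder_where_you_want_files_to_be_downloaded> -i urllist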

If you have no wget or shell available, you can do it with FileZilla:
sftp://yourname@web.sourceforge.net
Open the connection with SFTP and your password, then
browse to /home/pfs/.
After that path (there may be a ? mark) fill in the path of the folder you want to download on the remote site; in my case:
/home/pfs/project/maxbox/Examples/
This is the access path of the FRS:
File Release System: /home/frs/project/PROJECTNAME/
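If you prefer the command line to FileZilla, a minimal sketch with the OpenSSH sftp client looks like this (USERNAME and PROJECTNAME are placeholders; you will be asked for your SourceForge password):
sftp USERNAME@web.sourceforge.net
# then, at the sftp> prompt:
#   cd /home/frs/project/PROJECTNAME
#   get -r Examples        # recursively download the Examples folder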

Here is a sample Python script you could use to download files from SourceForge:
import os

import requests
from bs4 import BeautifulSoup

r = requests.get("https://sourceforge.net/Files/filepath")  # path to the folder with the files
soup = BeautifulSoup(r.content, 'html.parser')
download_dir = "download/directory/here"

# the file links sit inside <th headers="files_name_h"> cells on the files page
files = [file_.a['href'] for file_ in soup.find_all('th', headers="files_name_h")]
for file_download_url in files:
    filename = file_download_url.split('/')[-2]  # URLs end in .../<filename>/download
    if filename not in os.listdir(download_dir):
        r = requests.get(file_download_url)
        with open(os.path.join(download_dir, filename), "wb") as f:
            f.write(r.content)
            print(f"created file {os.path.join(download_dir, filename)}")

Related

How to download .txt file from box.com on mac terminal?

I do not have much experience with the command line, and despite doing my research I still wasn't able to solve my problem.
I need to download a .txt file from a folder on box.com.
I attempted using:
$ curl -o FILE URL
However, all I got was an empty text file named with random numbers. I assume this happened because the URL of the file's location does not end in .txt, since the file sits in a folder on box.com.
I also attempted:
$ wget FILE URL
However, my Mac terminal doesn't seem to find that command.
Is there a different command that can download the file from box.com? Or am I missing something?
You need to put your URL in quotes to stop the shell from trying to parse it:
curl -o myfile.txt "http://example.com/"
Update: If the URL requires authentication
Modern browsers allow you to export requests as curl commands.
For example, in Chrome, you can:
open your file URL in a new tab
open Developer tools (View -> Developer -> Developer Tools)
switch to Network tab in the tools
refresh the page, a request should appear in the "Network" tab
Right-click the request, choose "Copy -> Copy as cURL"
paste the command in the shell
Here's how it looks for this page, for example (screenshot omitted).
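For illustration, the copied command usually looks something like the sketch below; the URL and header values are placeholders, not a real box.com request, and the -o option is added by hand to save the response to a file:
curl 'https://app.box.com/...your-file-link...' \
  -H 'User-Agent: Mozilla/5.0' \
  -H 'Cookie: <session cookies copied by the browser>' \
  -o myfile.txt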

How do I Convert a folder filled with .URL files to a .TXT in Mac OSX terminal or otherwise

I have .URL files in a folder labeled Resources.
For all intents and purposes, suppose they are my reading-list URLs or my Favorites URLs.
People sometimes accumulate a bunch of URL files.
A URL file simply opens a website.
I would like to convert all those URLs to a .TXT or Word file so that I can easily copy them into my website and use them as links. Is that possible?
Microsoft gives the following instructions for doing this in CMD on Windows; the closest thing I could do on a Mac did not seem to work for me:
http://support.microsoft.com/kb/183436
Click Start, point to Find, and then click Files Or Folders.
In the Named box, type *.url.
In the Look In box, type <path>\favorites, where <path> is the path to your Favorites folder. By default, the Favorites folder is located in the C:\Windows folder in Windows 95 and the C:\Winnt\Profiles\ folder in Windows NT.
Click Find Now.
On the Edit menu, click Select All.
On Edit menu, click Copy.
Click Start, point to Programs, and then click Windows Explorer or Windows NT Explorer.
Click drive C.
On the File menu, point to New, and then click Folder.
Type myurls, and then press ENTER.
Double-click the Myurls folder.
On the Edit menu, click Paste.
Click Start, point to Programs, and then click MS-DOS Prompt or Command Prompt.
Type the following commands at the command prompt, pressing ENTER after each line:
1. cd \
2. cd myurls
3. copy *.url url.txt
That is what Microsoft support said; as far as I understand (correct me if I'm wrong), it takes your favorites and combines them into a plain text file so you can click on each link at your leisure.
I have a similar application here.
These are the commands I used:
cd resources
cp *.url url.txt
I got the following message:
usage: cp [-R [-H | -L | -P]] [-fi | -n] [-apvX] source_file target_file
cp [-R [-H | -L | -P]] [-fi | -n] [-apvX] source_file ... target_directory
If they are standard Windows *.url files, you can do this:
grep -h ^URL *.url | sed 's/URL=//' > mylinks.txt
Then just the website addresses will be in the file mylinks.txt.
Furthermore, you can add | sort | uniq if you would like your URLs in alphabetical order with duplicates removed, like this:
grep -h ^URL *.url | sed 's/URL=//' | sort | uniq > mySortedLinks.txt
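For context, a standard Windows .url file is just a small INI-style text file, which is why grepping for lines that start with URL works. A tiny illustrative run (the file name and address are made up):
printf '[InternetShortcut]\nURL=http://example.com/some/page\n' > example.url
grep -h '^URL' *.url | sed 's/URL=//'    # prints: http://example.com/some/page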
The closest UNIX equivalent to DOS copy used this way is cat (cp complains here because, with more than one source file, the last argument must be an existing directory):
cat *.url > urls.txt

Mac zip compress without __MACOSX folder?

When I compress files with the built in zip compressor in Mac OSX, it causes an extra folder titled "__MACOSX" to be created in the extracted zip.
Can I adjust my settings to keep this folder from being created or do I need to purchase a third party compression tool?
UPDATE: I just found a freeware app for OSX that solves my problem: "YemuZip"
UPDATE 2: YemuZip is no longer freeware.
This can be fixed after the fact with zip -d filename.zip __MACOSX/\*
And, to also delete .DS_Store files: zip -d filename.zip \*/.DS_Store
When I had this problem I did it from the command line:
zip file.zip uncompressed
EDIT, after many downvotes: I used this option some time ago and I don't remember where I learned it, so I can't give you a better explanation. Chris Johnson's answer is correct, but I won't delete mine. As one comment says, it's closer to what the OP is asking, since it compresses without those files instead of removing them from an already compressed file. I find it easier to remember, too.
Inside the folder you want to be compressed, in terminal:
zip -r -X Archive.zip *
Where -X means: exclude those invisible Mac resource files such as "__MACOSX" or "._Filename" and .DS_Store files
source
Note: this will only work for the folder you are in and its subfolder tree, and it has to have the * wildcard.
This command did it for me:
zip -r Target.zip Source -x "*.DS_Store"
Target.zip is the zip file to create. Source is the source file/folder to zip up. The -x parameter specifies the file/folder to exclude.
If the above doesn't work for whatever reason, try this instead:
zip -r Target.zip Source -x "*.DS_Store" -x "__MACOSX"
I'm using this Automator shell script to fix it afterwards.
It shows up as a contextual menu item (right-click any file in Finder).
while read -r p; do
zip -d "$p" __MACOSX/\* || true
zip -d "$p" \*/.DS_Store || true
done
Create a new Service with Automator
Select "Files and Folders" in "Finder"
Add a "Shell Script Action"
zip -r "$destFileName.zip" "$srcFileName" -x "*/\__MACOSX" -x "*/\.*"
-x "*/\__MACOSX": ignore __MACOSX as you mention.
-x "*/\.*": ignore any hidden file, such as .DS_Store .
Quote the variables so that filenames containing spaces are handled correctly.
Also, you can build an Automator Service to make this easy to use from Finder.
Check the link below for details if you need them.
Github
The unwanted folders can also be deleted in the following way:
zip -d filename.zip "__MACOSX*"
Works best for me
The zip command line utility never creates a __MACOSX directory, so you can just run a command like this:
zip directory.zip -x \*.DS_Store -r directory
In the output below, a.zip, which I created with the zip command line utility, does not contain a __MACOSX directory, but a 2.zip, which I created from Finder, does.
$ touch a
$ xattr -w somekey somevalue a
$ zip a.zip a
adding: a (stored 0%)
$ unzip -l a.zip
Archive: a.zip
Length Date Time Name
-------- ---- ---- ----
0 01-02-16 20:29 a
-------- -------
0 1 file
$ unzip -l a\ 2.zip # I created `a 2.zip` from Finder before this
Archive: a 2.zip
Length Date Time Name
-------- ---- ---- ----
0 01-02-16 20:29 a
0 01-02-16 20:31 __MACOSX/
149 01-02-16 20:29 __MACOSX/._a
-------- -------
149 3 files
-x .DS_Store does not exclude .DS_Store files inside directories but -x \*.DS_Store does.
The top level file of a zip archive with multiple files should usually be a single directory, because if it is not, some unarchiving utilities (like unzip and 7z, but not Archive Utility, The Unarchiver, unar, or dtrx) do not create a containing directory for the files when the archive is extracted, which often makes the files difficult to find, and if multiple archives like that are extracted at the same time, it can be difficult to tell which files belong to which archive.
Archive Utility only creates a __MACOSX directory when you create an archive where at least one file contains metadata such as extended attributes, file flags, or a resource fork. The __MACOSX directory contains AppleDouble files whose filename starts with ._ that are used to store OS X-specific metadata. The zip command line utility discards metadata such as extended attributes, file flags, and resource forks, which also means that metadata such as tags is lost, and that aliases stop working, because the information in an alias file is stored in a resource fork.
Normally you can just discard the OS X-specific metadata, but to see what metadata files contain, you can use xattr -l. xattr also includes resource forks and file flags, because even though they are not actually stored as extended attributes, they can be accessed through the extended attributes interface. Both Archive Utility and the zip command line utility discard ACLs.
You can't.
But what you can do is delete those unwanted folders after zipping. Command line zip takes different arguments where one, the -d, is for deleting contents based on a regex. So you can use it like this:
zip -d filename.zip __MACOSX/\*
Cleanup .zip from .DS_Store and __MACOSX, including subfolders:
zip -d archive.zip '__MACOSX/*' '*/__MACOSX/*' .DS_Store '*/.DS_Store'
Walkthrough:
Create .zip as usual by right-clicking on the file (or folder) and selecting "Compress ..."
Open Terminal app (search Terminal in Spotlight search)
Type zip in the Terminal (but don't hit enter)
Drag .zip to the Terminal so it converts to the path
Copy paste -d '__MACOSX/*' '*/__MACOSX/*' .DS_Store '*/.DS_Store'
Hit enter
Use zipinfo archive.zip to list files inside, to check (optional)
I have a better solution after reading all of the existing answers. Everything can be done with a workflow and a single right click.
No additional software, no complicated command-line stuff and no shell tricks.
The automator workflow:
Input: files or folders from any application.
Step 1: Create Archive, the system builtin with default parameters.
Step 2: Run Shell Script, with the input passed as arguments. Copy the commands below.
zip -d "$#" "__MACOSX/*" || true
zip -d "$#" "*/.DS_Store" || true
Save it and we are done! Just right-click a folder or a bunch of files and choose the workflow from the Services menu. An archive with no metadata will be created alongside.
IMAGE UPDATE: I chose "Quick Action" when creating the new workflow (English version of the screenshot omitted).
To avoid zipping any hidden files:
zip newzipname filename.any -x "\.*"
For this question, it should be:
zip newzipname filename.any -x "\__MACOSX"
It must be said, though, that this zip command, run in the terminal, compresses only the named file and nothing else, so the result of the following is the same:
zip newzipname filename.any
Keka does this. Just drag your directory over the app screen.
Do you mean the zip command-line tool or the Finder's Compress command?
For zip, you can try the --data-fork option. If that doesn't do it, you might try --no-extra, although that seems to ignore other file metadata that might be valuable, like uid/gid and file times.
For the Finder's Compress command, I don't believe there are any options to control its behavior. It's for the simple case.
The other tool, and maybe the one that the Finder actually uses under the hood, is ditto. With the -c -k options, it creates zip archives. With this tool, you can experiment with --norsrc, --noextattr, --noqtn, --noacl and/or simply leave off the --sequesterRsrc option (which, according to the man page, may be responsible for the __MACOSX subdirectory). Although, perhaps the absence of --sequesterRsrc simply means to use AppleDouble format, which would create ._ files all over the place instead of one __MACOSX directory.
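As a sketch of that experiment (the options are taken from the ditto man page; the folder and archive names below are placeholders):
# create archive.zip from MyFolder, dropping resource forks, extended attributes and quarantine info
ditto -c -k --norsrc --noextattr --noqtn MyFolder archive.zip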
This is how I avoid the __MACOSX directory when compressing files with the tar command:
$ cd dir-you-want-to-archive
$ find . | xargs xattr -l # <- list all files with special xattr attributes
...
./conf/clamav: com.apple.quarantine: 0083;5a9018b1;Safari;9DCAFF33-C7F5-4848-9A87-5E061E5E2D55
./conf/global: com.apple.quarantine: 0083;5a9018b1;Safari;9DCAFF33-C7F5-4848-9A87-5E061E5E2D55
./conf/web_server: com.apple.quarantine: 0083;5a9018b1;Safari;9DCAFF33-C7F5-4848-9A87-5E061E5E2D55
Delete the attribute first:
find . | xargs xattr -d com.apple.quarantine
Run find . | xargs xattr -l again and make sure no file has any xattr attributes. Then you're good to go:
tar cjvf file.tar.bz2 dir
Another shell script that could be used with the Automator tool (see also benedikt's answer on how to create the script) is:
while read -r f; do
d="$(dirname "$f")"
n="$(basename "$f")"
cd "$d"
zip "$n.zip" -x \*.DS_Store -r "$n"
done
The difference here is that this code directly compresses the selected folders without the macOS-specific files (rather than compressing first and deleting afterwards).

SVN ignore problem in OS X Lion

Before installing Lion, when I tried to ignore something on my svn, I just typed the following command:
svn propedit svn:ignore .
This opened a temporary file for the current directory in the selected editor, and I could write my patterns there, which svn would then ignore.
After I installed Lion, when I type this command the following error appears: The document “svn-prop.tmp” could not be opened. The file doesn’t exist.
Has anybody else met this error before? (I tried googling, but I didn't find any solution.)
SVN_EDITOR=/Applications/TextEdit.app/Contents/MacOS/TextEdit
It seems that with Lion it is no longer possible to open a file with TextEdit from the command line by passing the file name as an argument.
A workaround is to use open:
export SVN_EDITOR='open -e -W -n '
-e tells open to use TextEdit (use -a if you want to specify a different application)
-W tells open to wait for TextEdit to quit. If not specified, svn propedit will read the file before it's edited and return, reporting that no changes were made.
-n tells open to launch a new instance of TextEdit even if one is already open. On the one hand this avoids having to quit an already open editor, and on the other hand it simply did not work without this option :-)
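Putting it together, a minimal setup looks like this (append the export line to your shell profile so it survives new terminal sessions):
# make TextEdit, opened via open, the editor svn uses for property editing
export SVN_EDITOR='open -e -W -n'
# then, inside the working copy:
svn propedit svn:ignore .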

Untar multipart tarball on Windows

I have a series of files named filename.part0.tar, filename.part1.tar, … filename.part8.tar.
I guess tar can create multiple volumes when archiving, but I can't seem to find a way to unarchive them on Windows. I've tried to untar them using 7zip (GUI & commandline), WinRAR, tar114 (which doesn't run on 64-bit Windows), WinZip, and ZenTar (a little utility I found).
All programs run through the part0 file, extracting 3 rar files, then quit reporting an error. None of the other part files are recognized as .tar, .rar, .zip, or .gz.
I've tried concatenating them using the DOS copy command, but that doesn't work, possibly because part0 thru part6 and part8 are each 100Mb, while part7 is 53Mb and therefore likely the last part. I've tried several different logical orders for the files in concatenation, but no joy.
Other than installing Linux, finding a live distro, or tracking down the guy who left these files for me, how can I untar these files?
Install 7-zip. Right click on the first tar. In the context menu, go to "7zip -> Extract Here".
Works like a charm, no command-line kung-fu needed:)
EDIT:
I only now noticed that you mention already having tried 7zip. It might have balked if you tried to "open" the tar by going "Open with" -> 7zip; their command line for opening files is a little unorthodox, so you have to associate via 7zip instead of via the file-association system built into Windows. If you try right-click -> "7-Zip" -> "Extract Here", though, that should work; I tested the solution myself (albeit on a 32-bit Windows box, as I don't have a 64-bit one available).
1) Download gzip for Windows from http://www.gzip.org/ and unpack it
2) gzip -c filename.part0.tar > foo.gz
gzip -c filename.part1.tar >> foo.gz
...
gzip -c filename.part8.tar >> foo.gz
3) Unpack foo.gz
Worked for me.
As above, I had the same issue and ran into this old thread. For me it was a severe case of RTFM when installing a Siebel VM. These instructions were straight from the manual:
cat \
OVM_EL5U3_X86_ORACLE11G_SIEBEL811ENU_SIA21111_PVM.tgz.1of3 \
OVM_EL5U3_X86_ORACLE11G_SIEBEL811ENU_SIA21111_PVM.tgz.2of3 \
OVM_EL5U3_X86_ORACLE11G_SIEBEL811ENU_SIA21111_PVM.tgz.3of3 \
| tar xzf -
Worked for me!
The tar -M switch should do it for you on Windows (I'm using tar.exe).
tar --help says:
-M, --multi-volume create/list/extract multi-volume archive
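A sketch of that invocation with GNU tar (it prompts when it reaches the end of each volume, where you can supply the next part with n filename.part1.tar and so on):
tar -x -M -f filename.part0.tar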
I found this thread because I had the same problem with these files. Yes, the same exact files you have. Here's the correct order: 042358617 (i.e. start with part0, then part4, etc.)
Concatenate in that order and you'll get a tarball you can unarchive. (I'm not on Windows, so I can't advise on what app to use.) Note that of the 19 items contained therein, 3 are zip files that some unarchive utilities will report as being corrupted. Other apps will allow you to extract 99% of their contents. Again, I'm not on Windows, so you'll have to experiment for yourself.
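In shell terms (Git Bash, Cygwin or WSL on Windows), that concatenation order looks like the sketch below; on plain Windows you can use the copy /b form from the answer further down with the same order:
cat filename.part0.tar filename.part4.tar filename.part2.tar filename.part3.tar \
    filename.part5.tar filename.part8.tar filename.part6.tar filename.part1.tar \
    filename.part7.tar > filename.tar
tar xf filename.tar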
Enjoy! ;)
This works well for me with multi-volume tar archives (numbered .tar.1, .tar.2 and so on) and even lets you --list or --get specific folders or files in them:
#!/bin/bash
TAR=/usr/bin/tar
ARCHIVE=bkup-01Jun
RPATH=home/user
RDEST=restore/
EXCLUDE=.*
mkdir -p $RDEST
$TAR vf $ARCHIVE.tar.1 -F 'echo '$ARCHIVE'.tar.${TAR_VOLUME} >&${TAR_FD}' -C $RDEST --get $RPATH --exclude "$EXCLUDE"
Copy to a script file, then just change the parameters:
TAR=location of tar binary
ARCHIVE=Archive base name (without .tar.multivolumenumber)
RPATH=path to restore (leave empty for full restore)
RDEST=restore destination folder (relative or absolute path)
EXCLUDE=files to exclude (with pattern matching)
Interestingly, you really DON'T use the -M option here, as that would only prompt you with questions (insert next volume, etc.).
Hello, perhaps this will help.
I had the same problem:
a backup of my web site, made automatically on CentOS at 4 am, created multiple files in multi-volume tar format (saveblabla.tar, saveblabla.tar1.tar, saveblabla.tar2.tar, etc.).
After downloading these files to my PC (Windows) I couldn't extract them with either the Windows cmd or 7zip (unknown error).
I first did a binary copy to reassemble the tar files (as described above in this thread):
copy /b file1+file2+file3 destination
After that, 7zip worked!!! Thanks for your help.
