How to download model from huggingface? - huggingface-transformers

https://huggingface.co/models
For example, I want to download 'bert-base-uncased', but can't find a 'Download' link. Please help. Or is it not downloadable?

The accepted answer is good, but writing code to download a model is not always convenient. It seems git works fine for getting models from Hugging Face. Here is an example:
git lfs clone https://huggingface.co/sberbank-ai/ruT5-base
where 'lfs' stands for 'large file storage'. Technically this command is deprecated and a plain 'git clone' should work, but then you need to set up the LFS filters so that large files are not skipped (see: How do I clone a repository that includes Git LFS files?).

The models are automatically cached locally when you first use them.
So, to download a model, all you have to do is run the code that is provided in the model card (I chose the corresponding model card for bert-base-uncased).
At the top right of the page you can find a button called "Use in Transformers", which even gives you the sample code, showing you how to use it in Python. Again, for bert-base-uncased, this gives you the following code snippet:
from transformers import AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
When you run this code for the first time, you will see a download bar appear on screen. See this post (disclaimer: I gave one of the answers) if you want to find the actual folder where Huggingface stores their models.
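As a rough sketch of where that cache folder usually ends up, the snippet below rebuilds the path from environment variables. It assumes the layout used by recent huggingface_hub releases (HF_HUB_CACHE, then HF_HOME/hub, then ~/.cache/huggingface/hub, with one "models--..." folder per repository); older transformers versions cached elsewhere. The function names hub_cache_dir and model_cache_folder are made up for this illustration.

```python
import os
from pathlib import Path


def hub_cache_dir(env=None):
    """Guess the Hugging Face hub cache directory (assumed layout, see note above)."""
    env = os.environ if env is None else env
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"]).expanduser()
    if env.get("HF_HOME"):
        return Path(env["HF_HOME"]).expanduser() / "hub"
    # Fall back to the XDG cache directory, or ~/.cache when it is not set.
    base = env.get("XDG_CACHE_HOME") or str(Path.home() / ".cache")
    return Path(base).expanduser() / "huggingface" / "hub"


def model_cache_folder(repo_id, env=None):
    """Folder for one model: slashes in the repo id become double dashes."""
    return hub_cache_dir(env) / ("models--" + repo_id.replace("/", "--"))


if __name__ == "__main__":
    print(model_cache_folder("bert-base-uncased"))
```

If the printed folder does not exist on your machine, the model has probably not been downloaded yet or your installed version uses a different cache location.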

I agree with Jahjajaka's answer. In addition, you can find the git URL by clicking the button called "Use in Transformers".

I typically see if the model has a GitHub repo where I can download the zip file. Due to my company protocols I often cannot directly connect to some sources without getting an SSL certificate error, but I can download from GitHub.

How about using hf_hub_download from huggingface_hub library?
hf_hub_download returns the local path where the model was downloaded so you could hook this one liner with another shell command.
python3 -c 'from huggingface_hub import hf_hub_download; downloaded_model_path = hf_hub_download(
repo_id="CompVis/stable-diffusion-v-1-4-original",
filename="sd-v1-4.ckpt",
use_auth_token=True
); print(downloaded_model_path)'
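If installing huggingface_hub is not an option, a single file from a public repository can also be fetched over plain HTTPS. The sketch below builds the "resolve" URL that the website's file pages link to and downloads it with the standard library. The helper names hub_file_url and download_file are hypothetical, and this assumes a public repository: gated or private ones (such as the Stable Diffusion checkpoint above) need authentication, which this sketch does not handle.

```python
import shutil
import urllib.request
from urllib.parse import quote


def hub_file_url(repo_id, filename, revision="main"):
    """Build the direct download URL for one file in a model repository."""
    return "https://huggingface.co/{}/resolve/{}/{}".format(
        repo_id, quote(revision, safe=""), quote(filename)
    )


def download_file(url, destination):
    """Stream the file at `url` to `destination` (public repositories only)."""
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out)
    return destination


if __name__ == "__main__":
    # Only prints the URL; call download_file(url, "config.json") to fetch it.
    print(hub_file_url("bert-base-uncased", "config.json"))
```

Unlike hf_hub_download, this does no caching and no integrity checks, so it is mainly useful for a one-off download.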

Related

How to use d3's zoomable-sunburst

I'm trying to use zoomable sunburst to display some data. I've generated the json file and am able to use the site to display my data. Now I'm trying to do this on my local machine, but am not sure of the correct way to go about this.
I think there are a couple of ways of going about this. One would be to just dump the js code into a js file and import it into an html file. I've seen some implementations on GitHub that let me do this, but they are not as clean as the one I've found on observablehq. And I'm unable to get the one on observablehq to work locally by copying and pasting.
I also see an option on observablehq where I can download the code. I did that, and the readme that came with it says that I need to run it on a server (e.g. python -m http.server), but when I run the server from the folder containing the downloaded code, I keep getting a bunch of
code 404, message File not found
Now I'm a bit confused. I'd like to know the "right" way to go about using zoomable sunburst to show my data, and if it's at all possible to run this on my local.
Any suggestions/advice would be great. Thanks.
I'm super late to the party, but here is what the problem was for me:
I was running python -m http.server in the wrong directory, i.e. a directory that didn't have an index.html file inside. Once I ran it in the directory that had the index.html file, it worked perfectly.
Hope this helps someone!
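To check which directory is the right one before starting the server by hand, something like the following sketch can help. It serves a given folder on a free local port and reports the HTTP status for /index.html, so a folder without that file shows up as 404. The function name index_status is made up for this example; the rest is the Python standard library.

```python
import functools
import tempfile
import threading
import urllib.error
import urllib.request
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path


def index_status(directory):
    """Serve `directory` locally and return the HTTP status of /index.html."""
    handler = functools.partial(SimpleHTTPRequestHandler, directory=str(directory))
    server = ThreadingHTTPServer(("127.0.0.1", 0), handler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    # Ignore proxy settings so the request really goes to the local server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        url = "http://127.0.0.1:%d/index.html" % server.server_address[1]
        try:
            with opener.open(url) as response:
                return response.status
        except urllib.error.HTTPError as error:
            return error.code
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(index_status(tmp))  # empty folder, no index.html yet
        Path(tmp, "index.html").write_text("<!doctype html><title>ok</title>")
        print(index_status(tmp))  # same folder, now with index.html
```

Pointing index_status at the folder you downloaded from observablehq tells you whether python -m http.server would find an index.html there.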

Ruby / Git library. How to get a full file of a particular check in?

I am working on a script based on a git repository, using Ruby's git library.
I'm having trouble finding the feature that loads the full file from a historical check-in. In git the content can be shown like:
git show 234h23h4j23l4j:path/to/file.java
I just need to know how to do that in ruby / git.
Note that this commit (234h23h4j23l4j) does not necessarily have the file I'm looking at.
Or if you know any other git library can easily do this please also recommend. We can still switch, it's not too late.
You can try something like
commit = g.gcommit('1cc8667014381') #to get reference to some commit.
and then explore the commit object you get. (I found some documentation here.)

Is there a specific protocol to add everything to Git using Rugged?

I recently began using Rugged, and have already run into some problems. Basically all I want to do is add everything in a Git repo, commit the staged changes, and push everything to a branch. I've started out with the first step as follows:
@repo = Rugged::Repository.new(Dir.pwd)
@index = @repo.index
def git_add
  @index.add mode: 'add-all'
end
But the console ends up screaming at me. I browsed through libgit2's documentation, and couldn't find any examples of adding everything in a repo. Some thorough Googling yielded similar results. I could probably have just jammed in a @repo.workdir.entries as the path parameter for index.add, but I'm not sure. Is there a better way to go about this?
Depending on whether you want to stage every file or just the ones which are already in, you have two options, Index#add_all and Index#update_all respectively.
You can use repo.index.add_all() to stage every file under the specified directory. You can use repo.index.update_all() to do the same but only for those files which are already known to the repository, similarly to git's -A and -u options.

How do I use Github to access the same project files from different computers?

I work mainly on a desktop Mac but also have a laptop Mac that I use when away from the office.
I want to access and work on my latest html, css, php and python files from either computer.
I thought Github was the way to do this but am having a problem understanding the "flow" and I've RTFM! I don't understand whether I should create a Repository on Github first, why when I try to "clone" something it doesn't magically end up on my local computer... where the nice big red button that says "sync" is...
... or whether I should just use the commandline ONLY...
So, if I start on my desktop and create new files, what are the correct steps using git or GitHub to put those files where they can be accessed from my laptop, and then have the files on my laptop merged back into the GitHub repository so I can access them from my desktop?
Thank you all for your replies and answers! The git workflow, for my needs, is now clear.
The workflow presented by wadesworld is concise and was the overview I needed.
However, Michael Durrant's commandline steps filled in that workflow specifically with commandline directives - and I needed that also.
steelclaw and uDaY's answers and responses were important because I did not understand that it did not matter which repo I created first and, adding and committing locally were essential first steps in my workflow.
Specifically, steelclaw's response to one of my response questions provided the closure I needed, so I could learn more:
"After initializing the repository, be sure to use 'add' and 'commit.' These will make the files an official version of the repository. After that, you need to use 'push' to upload it to the remote repository."
ilollar's resource, "Git for Ages 4 and Up" is also worthy of the click, especially for folks like me who are visual!
Thank you all so very much!!
Do you want to version control your files or just have access to the same files in both places?
It is a good idea to use version control as a developer, whether you're writing code or designing websites. But, to do so, you have to have a commitment to learning how version control systems work, since they all have some learning curve.
But, if you're not interested in that complexity and simply want to be sure you have access to the latest version of your files, then you're looking at a file syncing operation which can be much more simple.
So, which one do you want?
Edit: Based on the response, here's the model:
1) Create repository on work computer.
2) Create repository with same name on github.
3) Push to repository on github
4) At home, do a git clone to pull down the changes you pushed.
5) Now that the repository exists in both locations, you can simply do a git push before you leave work, and git pull when you get home, and vice-versa when going the other direction.
To answer the detail of your question: I'd go with Dropbox.
UbuntuOne is also good even for non Ubuntu users and of course Google drive is the (big) new player on the block.
They compare as follows:
Service    Free (GB)  Next level (GB)  Next level ($/month)  Features
Dropbox    2          50               $2.50                 One folder, best GUI sync tools.
UbuntuOne  5          20               $4.00                 Multiple directories anywhere.
GDrive     5          25               $2.50                 It's Google.
To answer the title of your question:
If you wanted something that's more suited to programmers, I'd use git:
First, install gitx (Linux readers, that's gitg), as that is by far the most popular GUI for git.
For the "flow" I can also refer you to my write-up of various features at:
What are the core concepts of git, github, fork & branch. How does git compare to SVN?
Using gitx or gitg the specific flow is as follows:
1) Make some changes to files.
2) Use the tool's "commit" tab to see what's changed ("unstaged").
3) Add a file by dragging it from "unstaged" to "staged".
4) Give a commit message.
5) Commit the file.
6) I then push it to the remote at the command line with $ git push remote, or I use the GUI by right-clicking and selecting the 2nd master.
If I'm sharing with others I'll often need to do a git pull (to get and merge in others' changes) before being able to do a git push.
The github part is doing init and push and clone but I'd say just read up on those tutorials more rather than an SO question. Basically though, I do:
Set up repository locally in git:
git init
git add .
git commit -m "Initial commit"
Set up github:
Create a GitHub repository using GitHub (https://help.github.com/articles/create-a-repo), add it as a remote of your local repository, and then push to it, as in:
git remote add origin <URL of your GitHub repository>
git push origin master
If the repository already exists on GitHub but not on your local PC, that's when you copy the remote link and then in a terminal type git clone [paste here, e.g. ctrl-v]
If you're "starting" with github:
Make code changes, then:
git pull                 # get the latest version into your repository and merge in any changes
git add .                # add all modified files
git commit -m "message"
git push                 # origin master is the default
If, at the end of the day, you decide to go with something simple like Dropbox, you can use my referral link (http://db.tt/pZrz4t3k) to get a little more than the standard 2 GB; using it, we both get an extra 0.5 GB. Which of these routes to take is up to you and your needs. I use all of these services (git, GitHub, UbuntuOne, Dropbox and Google Drive), so I am not recommending one over the others; it depends on your needs.
I would recommend using DropBox or Google Drive. They will let you do EXACTLY what you are trying to achieve, they are very user friendly (and free [5 Gb I think]).
They automatically update (as long as you have an internet connection obviously)
Just make a folder, put some files in it, and you are away.
Since explaining how to use an entire VCS in one answer is an overwhelming task, I can instead point you in the direction of some very helpful resources to get you to understanding and using Git:
Pro Git - a free online book (written with Git!) with easy language on all things Git.
GitHub Help - GitHub's own help section walks you through setting up and using Git, and not just with their own apps. Very useful.
Get Started with Git - A good tutorial getting you up and running with Git.
Git For Ages 4 and Up - Fantastic video explaining the inner-workings of Git with Tinker Toys. Not best for an introduction into Git, but a great video to watch once you feel a bit more comfortable.
Git may feel complicated or strange at first, but if what you are looking for is a good version control system, it is excellent.
However, if all you're looking for is a cloud-like service to sync some files across multiple computers, like the others have mentioned, Dropbox would be the way to go.
I use GitHub as a "hub" for git, to share finished code (and git for version control).
And Dropbox to sync files between different computers and mobile/tablet, to manage files.
http://db.tt/EuXOgGQ
They serve different purposes for me. Both are good!
Git is an advanced and rather difficult tool to use for version control. If you're feeling brave, you can try to install the command line tool, however I recommend using a graphical client, specifically SourceTree.
http://www.atlassian.com/software/sourcetree/overview
You'll need to clone your repository, or else initialize a new one. To connect to your repository, you'll need to know the URL, and possibly a username and password for your repository. You also need to provide a valid name for the repository.
To update files there are several steps: First, you need to add the changes to the staging area. SourceTree might do this automatically. Then you need to commit the changes. This is basically confirming changes and signing them with a comment. To upload them, you need to use push and select the correct remote repository. When you want to update your local repository, you'll need to use pull and again select the correct remote repository.
For your purposes, however, it seems like dropbox might be better, because it automatically updates and is very simple. If you don't need the advanced version control that git provides (e.g. branching, merging from many users), then it seems like it would be a better option for you.
https://www.dropbox.com/

Getting Mercurial in-process hook to run on Windows

I'm trying to get a Mercurial in-process hook to run on Windows.
The problem is not how to write the hook (I want to use an existing one, in this case BugTracker.Net's hook for Mercurial integration - I didn't find a direct link to the file, but you can see it if you download BT.net here, it's in the "mercurial" subfolder).
The problem is how to tell Mercurial to run it.
I spent quite some time to read the documentation, but I'm stuck right now.
(it would probably be easier with a certain knowledge of Python - which I don't have)
I know that I have to insert a line in the hgrc file (in the .hg folder of my repository).
There's an example in the HG Book which looks like this:
[hooks]
commit.example = python:mymodule.submodule.myhook
And there's another example on the Mercurial site, it looks like this:
[hooks]
changegroup = /path/to/changegrouphook
Now I want a "incoming" hook, so at least I know I have to do this:
[hooks]
incoming.btnet = X
The problem is to figure out "X".
The filename is hg_hook_for_btnet.py and in the file, there is a line which looks like this:
def debug_out(s):
I suppose that's the name of the "function" itself.
So my line needs to look something like this:
[hooks]
incoming.btnet = python:hg_hook_for_btnet.debug_out
But this gives me an error message [Errno 2] No such file or directory when I push.
I already tried lots of different variations, but it doesn't work and I don't know what I'm doing wrong.
Do I need python: at the beginning or not?
Do I need to specify the file extension .py or not?
Do I need /path/to/... as indicated in the example from the Mercurial site (see above)?
If yes, what is the correct syntax for the path? (just c:\MyRepo\ doesn't work - syntax must be different in Python)
Also, did I put the hook file into the correct folder?
Right now, it is in the main folder of my repository (on the same level as the .hg folder).
EDIT:
Martin, I changed it into this:
[hooks]
incoming.btnet = python:~c:\HG\MyRepo\hg_hook_for_btnet.py:debug_out
Now I get a different message: [Errno 22] Invalid argument
I suppose this is because of the repo and ui arguments you mentioned.
So, does this mean that the hook script is broken?
(as I said - I don't know anything about Python, this is an existing hook script from an open source bugtracker)
EDIT 2:
Sorry for the confusion regarding in-process and separate process - I know there is a difference, but I assumed that if the hook is written in Python, it must be in-process automatically (turns out I was wrong :-)
Okay, with the syntax in your edited answer, the script at least runs.
I have Python 2.7 installed (already did that before I asked the question here) and changed the first line in the script into #!C:\Python27\python.exe.
Now I get this:
running hook incoming.btnet: c:\HG\MyRepo\hg_hook_for_btnet.py
warning: incoming.btnet hook exited with status 1
So the script runs, but there is still some error.
This seems to be a Bugtracker.NET related problem, so I will ask on the BT.NET mailing list for further advice.
Thank you for your help though, without you I probably wouldn't even have come so far!
You should use
[hooks]
incoming.btnet = python:~/path/to/hg_hook_for_btnet.py:debug_out
and define debug_out as
def debug_out(ui, repo, **kwargs):
# ...
as explained in the HG book: all hooks are called with a ui and a repo argument plus some extra hook-specific arguments. The Mercurial API page explains what you can do with the ui and repo arguments.
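For illustration, here is a minimal sketch of an in-process hook with that signature. The name incoming_hook, the message text and the FakeUI stand-in are hypothetical and not taken from the BugTracker.NET script. It assumes the incoming hook passes node and source as keyword arguments and that ui.status accepts byte strings, which is how Python 3 builds of Mercurial behave; on Python 2.7 a b"..." literal is an ordinary string, so the same code applies.

```python
def incoming_hook(ui, repo, node=None, source=None, **kwargs):
    """Report each incoming changeset; a falsy return value signals success."""
    ui.status(b"incoming changeset %s from %s\n" % (node or b"?", source or b"?"))
    return False


class FakeUI(object):
    """Stand-in for Mercurial's ui object, so the hook can be tried without hg."""

    def __init__(self):
        self.messages = []

    def status(self, message):
        self.messages.append(message)


if __name__ == "__main__":
    ui = FakeUI()
    incoming_hook(ui, None, node=b"1cc8667014381", source=b"push", hooktype=b"incoming")
    print(ui.messages)
```

In hgrc such a function would be referenced with the python:/path/to/file.py:incoming_hook form shown above.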
Edit: Aha... I've now looked at the script. It is not designed to be run as an in-process Mercurial hook. It is instead designed to be run as a separate process. So you will need to use
[hooks]
incoming.btnet = c:\HG\MyRepo\hg_hook_for_btnet.py
and make sure you follow the instructions in the script: it talks about setting the path to the hg.exe binary and to your Python interpreter. The latter means that the author expects you to install Python. There is an email address in the script. I suggest you contact Corey Trager directly or via a BugTracker.NET mailing list. Since it's a bug tracker, I assume they have a proper place where you can report this! :-)
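To show the difference from an in-process hook, here is a small sketch of how a hook run as a separate process typically gets its input. Mercurial passes the hook arguments as environment variables with an HG_ prefix (for the incoming hook, HG_NODE, HG_SOURCE and HG_URL), and the process exit status tells Mercurial whether the hook succeeded. The function names and message text are hypothetical; this is not the BugTracker.NET script.

```python
import os
import sys


def describe_incoming(env):
    """Build a one-line summary from the HG_* variables, or None if they are missing."""
    node = env.get("HG_NODE")
    if not node:
        return None
    return "incoming changeset %s via %s" % (node[:12], env.get("HG_SOURCE", "unknown"))


def main(env=None):
    """Return the exit status that the hook process should finish with."""
    message = describe_incoming(os.environ if env is None else env)
    if message is None:
        sys.stderr.write("HG_NODE is not set; was this started by Mercurial?\n")
        return 1  # a non-zero status is reported as "hook exited with status 1"
    print(message)
    return 0


# A real hook script would end with: sys.exit(main())
```

A warning such as "incoming.btnet hook exited with status 1" therefore means that the script itself ran and returned a failure code.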
