Scrape JSON response with Selenium Browser [closed] - ajax

I want to perform the following actions automatically:
1. Open a web page with Google Chrome.
2. Wait for it to render all the needed information.
3. Open Inspect Element, switch to the Network tab, and look at the XHR requests.
4. Find the file that I need.
5. Copy the content of its response (to save it in a txt file).
It's a kind of web scraping, but with less effort (or so I think). The problem is that I can't find which tools let me do that.
I started with Python and Selenium (ChromeDriver), but I couldn't find any information on whether it is possible to get XHR responses or not. All the tutorials are about scraping HTML. It seems like it should be possible, but my research didn't help.
Any idea?
Thank you.

The website you are trying to scrape has content that is dynamically generated by JavaScript.
You have two options to work your way around that:
1. Simulate human browser interaction with Selenium: open the website, wait until all the content is rendered, and then use Selenium to extract the data you seek. This approach works off the Elements tab; you just use CSS or XPath selectors to get the tags you want.
2. Instead of finding a way to make Selenium go to the Network tab and save the content (which you will find extremely hard to do), get the URL of the XHR request, rebuild the same request with the same headers and parameters (if any exist), and then use requests to send it. You can then save the content easily.
Let's try to scrape Home | Microsoft Academic
First approach:
from selenium import webdriver

driver = webdriver.Chrome()  # Launch the browser
driver.get("https://academic.microsoft.com/home")  # Go to the given url
authors = driver.find_elements_by_xpath('//a[@data-appinsights-action="TopAuthorSelected"]')  # get the elements using selectors
for author in authors:  # loop through them
    print(author.text)
Output:
1. Yoshua Bengio
2. Geoffrey E. Hinton
3. Andrew Zisserman
4. Ilya Sutskever
5. Jian Sun
6. Trevor Darrell
7. Scott Shenker
8. Jiawei Han
9. Kaiming He
10. Ross Girshick
11. Ion Stoica
12. Hari Balakrishnan
13. R Core Team
14. Jitendra Malik
15. Jeffrey Dean
Second approach:
import requests

res = requests.get('https://academic.microsoft.com/api/analytics/authors/topauthors?topicId=41008148&take=15&filter=1&dateRange=1').json()
# The XHR response is usually in JSON format
# res = [{'name': 'Yoshua Bengio', 'id': '161269817', 'lat': 0.0, 'lon': 0.0}, {'name': 'Geoffrey E. Hinton', 'id': '563069026', 'lat': 0.0, 'lon': 0.0}, {'name': 'Andrew Zisserman', 'id': '2469405535', 'lat': 0.0, 'lon': 0.0}, {'name': 'Ilya Sutskever', 'id': '215131072', 'lat': 0.0, 'lon': 0.0}, {'name': 'Jian Sun', 'id': '2200192130', 'lat': 0.0, 'lon': 0.0}, {'name': 'Trevor Darrell', 'id': '2174985400', 'lat': 0.0, 'lon': 0.0}, {'name': 'Scott Shenker', 'id': '719828399', 'lat': 0.0, 'lon': 0.0}, {'name': 'Jiawei Han', 'id': '2121939561', 'lat': 0.0, 'lon': 0.0}, {'name': 'Kaiming He', 'id': '2164292938', 'lat': 0.0, 'lon': 0.0}, {'name': 'Ross Girshick', 'id': '2473549963', 'lat': 0.0, 'lon': 0.0}, {'name': 'Ion Stoica', 'id': '2161479384', 'lat': 0.0, 'lon': 0.0}, {'name': 'Hari Balakrishnan', 'id': '1998464616', 'lat': 0.0, 'lon': 0.0}, {'name': 'R Core Team', 'id': '2976715238', 'lat': 0.0, 'lon': 0.0}, {'name': 'Jitendra Malik', 'id': '2136556746', 'lat': 0.0, 'lon': 0.0}, {'name': 'Jeffrey Dean', 'id': '2429370538', 'lat': 0.0, 'lon': 0.0}]
for author in res:
    print(author['name'])
Output:
Yoshua Bengio
Geoffrey E. Hinton
Andrew Zisserman
Ilya Sutskever
Jian Sun
Trevor Darrell
Scott Shenker
Jiawei Han
Kaiming He
Ross Girshick
Ion Stoica
Hari Balakrishnan
R Core Team
Jitendra Malik
Jeffrey Dean
The second approach is straightforward and saves time and resources.
[Screenshots comparing the run time of the first and second approaches]

BrowserMob proxy (https://github.com/lightbody/browsermob-proxy) will help you with this. It will capture all requests, and when configured, their responses.
See this previous answer for more details: Running Selenium Webdriver with a proxy in Python
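If you go the proxy route, the sketch below shows roughly how the pieces fit together using the browsermob-proxy Python client. It is a minimal sketch, not a drop-in solution: the binary path is a placeholder you must point at your own install, the crude sleep stands in for a proper wait, and the "topauthors" URL filter is just the example endpoint from the answer above.

import time
from browsermobproxy import Server
from selenium import webdriver

# Placeholder path: point this at your BrowserMob Proxy install
server = Server("/path/to/browsermob-proxy/bin/browsermob-proxy")
server.start()
proxy = server.create_proxy()

# Route Chrome's traffic through the capturing proxy
options = webdriver.ChromeOptions()
options.add_argument("--proxy-server={}".format(proxy.proxy))
driver = webdriver.Chrome(options=options)

# Start a HAR capture that also records response bodies
proxy.new_har("academic", options={"captureContent": True})
driver.get("https://academic.microsoft.com/home")
time.sleep(5)  # crude wait; in real code, wait for the specific request to appear

# Each HAR entry holds the request and, with captureContent, the response text
for entry in proxy.har["log"]["entries"]:
    if "topauthors" in entry["request"]["url"]:
        print(entry["response"]["content"].get("text"))

driver.quit()
server.stop()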

Related

Can't export my image from Google Earth Engine

I'm aware I'm far from the first person to ask this question, but I found myself likewise perplexed at how to export an image from this godforsaken program. After trying to parse a lot of JavaScript that might as well be Greek to me, I've decided I'm in need of an answer specific to my own plight. Below is my code:
var aer = ee.ImageCollection('USDA/NAIP/DOQQ')
    .filter(ee.Filter.date('2020-02-01', '2020-09-04'));
var trueColor = aer.select(['R', 'G', 'B']);
var trueColorVis = {
  min: 0.0,
  max: 255.0,
};
var visual = trueColor.visualize({
  bands: ['R', 'B', 'G'],
  min: 0,
  max: 255.0
});
Map.setCenter(-97.43, 42.03, 13);
Map.addLayer(trueColor, trueColorVis, 'True Color');
var TrueColorvis = trueColor.visualize({
  bands: ['R', 'G', 'B'],
  max: 0.4
});
// Create a task that you can launch from the Tasks tab.
Export.image.toDrive({
  image: TrueColorvis,
  description: "Aerial view of Norfolk",
  scale: 30,
  region: geometry
});
With 'geometry' being the desired area that I highlighted, I'm constantly faced with the error message:
Line 11: trueColor.visualize is not a function
And while I'm vaguely aware that this is because my image isn't a 'function,' I'm not sure how to make it one. I'm pressed for time on a final project for a class and really just need to move on to the next step; I didn't expect this to be the part that sucked away multiple hours of my time.
I'm obviously pretty inexperienced with this program, so if an answer can be provided in a manner that doesn't rely on me having a solid depth of knowledge about it, that would make a world of difference (the exact reason I wasn't able to learn anything from answers to similar questions). Thanks!
The problem here is that trueColor is an ee.ImageCollection (many images), not an ee.Image (a single image). Only images have visualize().
To fix this, and do the same thing that Map.addLayer() does, use .mosaic() to combine the images:
var TrueColorvis = trueColor.mosaic().visualize({
  ...
});
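If you drive exports from Python instead of the Code Editor, the same fix looks roughly like the sketch below via the Earth Engine Python API. This is a minimal, untested sketch assuming an authenticated ee session; geometry is a placeholder for your own export region, and the description is renamed with underscores to avoid characters the task runner may reject.

import ee

ee.Initialize()  # assumes you have already authenticated

col = (ee.ImageCollection('USDA/NAIP/DOQQ')
       .filter(ee.Filter.date('2020-02-01', '2020-09-04'))
       .select(['R', 'G', 'B']))

# mosaic() flattens the collection into a single ee.Image, which has visualize()
img = col.mosaic().visualize(bands=['R', 'G', 'B'], min=0, max=255)

task = ee.batch.Export.image.toDrive(
    image=img,
    description='Aerial_view_of_Norfolk',
    scale=30,
    region=geometry,  # placeholder: your ee.Geometry
)
task.start()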

Plotly express boxplot image export - part of box colors missing in svg file

I have made a boxplot using plotly express (px.box), and the resulting plot looks good in my Google Chrome / Jupyter browser window. Here is a screenshot of two randomly selected boxes, and they look as I expect.
However, after exporting with pio.write_image, it looks like this (zoomed in):
WHY does it not fill up the whole box after export? What can I do to avoid it? I have tried defining width and height as "size*300" to set the DPI to 300, I have tried with and without "scale", I have tried using Orca as the image-export engine, tried exporting as .PDF, and updated Plotly (version 5.1.0). Links to comprehensive guides for exporting high-quality plotly plots for use as figures in scientific papers are also much appreciated, as the export quality is very often not satisfying.
An example of the problem can be reproduced with this:
import plotly.express as px
import plotly.io as pio
import pandas as pd
import plotly.graph_objs as go
import sys
import plotly

pio.templates.default = "simple_white"

x = ['Cat1', 'Cat1', 'Cat2', 'Cat2', 'Cat3', 'Cat3', 'Cat4', 'Cat4', 'Cat5', 'Cat5', 'Cat6', 'Cat6', 'Cat6', 'Cat7', 'Cat7', 'Cat8', 'Cat8', 'Cat11', 'Cat11', 'Cat12', 'Cat12', 'Cat10', 'Cat10', 'Cat9', 'Cat9', 'Cat13', 'Cat13', 'Cat14', 'Cat14', 'Cat15', 'Cat15', 'Cat16', 'Cat16', 'Cat17']
y = [0.0, 0.0, 0.0, 0.0047, 0.0, 0.036, 0.0, 0.0, 0.12314, 0.02472495, 0.004, 0.0, 0.013, 0.0, 0.0, 0.184, 0.056, 0.0186, 0.005928, 0.340, 0.20335, 0.0, 0.0, 0.2481, 0.12, 0.0, 0.0, 0.0201, 0.050, 0.0, 0.0, 0.041, 0.0199, 0.0]

data = {"x": x, "y": y}
df = pd.DataFrame(data)

box_plot = px.box(df, x="x", y="y", points="all", width=800, height=400)
box_plot.update_yaxes(title="Random numbers", title_font=dict(size=18, family='Arial'), tickfont=dict(family='Arial', size=18))
box_plot.update_xaxes(title=None, tickangle=45, title_font=dict(size=18, family='Arial'), tickfont=dict(family='Arial', size=18), categoryorder='array', categoryarray=["Cat2", "Cat1", "Cat3", "Cat4", "Cat5", "Cat6", "Cat10", "Cat11", "Cat12", "Cat9", "Cat8", "Cat7", "Cat13", "Cat14", "Cat15", "Cat16", "Cat17"])
box_plot.update_layout(margin=dict(l=40, r=10, t=10, b=25), width=1100, height=400, font_family="Arial")
box_plot.update_traces(boxmean="sd", selector=dict(type='box'))
box_plot.update_traces(pointpos=-2, selector=dict(type='box'))
box_plot.update_traces(marker_symbol="circle-open", selector=dict(type='box'))
box_plot.show()

pio.write_image(box_plot, r"Boxplot_minimal_work_ex.svg")
I tested first with only two categories, and the exported file looked fine! But when I increase the number of categories, the graph comes out in bad quality. I wondered whether setting the width has an influence, so I tried deleting the width and height settings from the px.box expression, but it gave the same bad result.
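For reference, pio.write_image exposes the knobs mentioned above (width, height, scale, engine) explicitly, so the variants are easy to compare from one call site. The snippet below is only an illustrative sketch of those parameters, reusing box_plot from the code above and assuming the kaleido package is installed; a raster export like PNG may sidestep the SVG fill artifact, at the cost of scalability.

import plotly.io as pio

# Export at 3x scale via the kaleido engine; swap format/extension to compare
pio.write_image(
    box_plot,
    "Boxplot_minimal_work_ex.png",
    format="png",
    width=1100,
    height=400,
    scale=3,
    engine="kaleido",
)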

Please explain the meaning of gsub("\047|,","",$0) in awk

From a source file such as:
{'PRIMER_INTERNAL_EXPLAIN': 'considered 3, ok 3',
'PRIMER_INTERNAL_NUM_RETURNED': 0L,
'PRIMER_LEFT_0': (29L, 22L),
'PRIMER_LEFT_0_END_STABILITY': 3.86,
'PRIMER_LEFT_0_GC_PERCENT': 45.45454545454545,
'PRIMER_LEFT_0_HAIRPIN_TH': 0.0,
'PRIMER_LEFT_0_PENALTY': 1.1657103482262485,
'PRIMER_LEFT_0_SELF_ANY_TH': 0.0,
'PRIMER_LEFT_0_SELF_END_TH': 0.0,
'PRIMER_LEFT_0_SEQUENCE': 'ATGGCAAATACACAGAGGAAGC',
'PRIMER_LEFT_0_TM': 58.98043552542492,
'PRIMER_LEFT_1': (32L, 22L),
'PRIMER_LEFT_1_END_STABILITY': 4.35,
'PRIMER_LEFT_1_GC_PERCENT': 45.45454545454545,
'PRIMER_LEFT_1_HAIRPIN_TH': 31.75767174449885,
'PRIMER_LEFT_1_PENALTY': 1.2635420728922853,
'PRIMER_LEFT_1_SELF_ANY_TH': 0.0,
'PRIMER_LEFT_1_SELF_END_TH': 0.0,
'PRIMER_LEFT_1_SEQUENCE': 'GCAAATACACAGAGGAAGCCTT',
'PRIMER_LEFT_1_TM': 58.915214375647565,
'PRIMER_LEFT_2': (27L, 23L),
'PRIMER_LEFT_2_END_STABILITY': 3.46,
'PRIMER_LEFT_2_GC_PERCENT': 43.47826086956522,
'PRIMER_LEFT_2_HAIRPIN_TH': 0.0,
'PRIMER_LEFT_2_PENALTY': 1.396379766521477,
'PRIMER_LEFT_2_SELF_ANY_TH': 0.0,
'PRIMER_LEFT_2_SELF_END_TH': 0.0,
'PRIMER_LEFT_2_SEQUENCE': 'TGATGGCAAATACACAGAGGAAG',
'PRIMER_LEFT_2_TM': 58.735746822319015,
'PRIMER_LEFT_EXPLAIN': 'considered 479, overlap excluded region 66, GC content failed 18, low tm 100, high tm 139, ok 156',
'PRIMER_LEFT_NUM_RETURNED': 3L,
'PRIMER_PAIR_0_COMPL_ANY_TH': 0.0,
'PRIMER_PAIR_0_COMPL_END_TH': 0.0,
'PRIMER_PAIR_0_PENALTY': 3.6468465778591295,
'PRIMER_PAIR_0_PRODUCT_SIZE': 114L,
'PRIMER_PAIR_1_COMPL_ANY_TH': 0.0,
'PRIMER_PAIR_1_COMPL_END_TH': 0.0,
'PRIMER_PAIR_1_PENALTY': 3.744678302525166,
'PRIMER_PAIR_1_PRODUCT_SIZE': 111L,
'PRIMER_PAIR_2_COMPL_ANY_TH': 0.0,
'PRIMER_PAIR_2_COMPL_END_TH': 0.0,
'PRIMER_PAIR_2_PENALTY': 3.877515996154358,
'PRIMER_PAIR_2_PRODUCT_SIZE': 116L,
'PRIMER_PAIR_NUM_RETURNED': 3L,
'PRIMER_RIGHT_0': (142L, 22L),
'PRIMER_RIGHT_0_END_STABILITY': 3.33,
'PRIMER_RIGHT_0_GC_PERCENT': 40.90909090909091,
'PRIMER_RIGHT_0_HAIRPIN_TH': 39.90652312082375,
'PRIMER_RIGHT_0_PENALTY': 2.481136229632881,
'PRIMER_RIGHT_0_SELF_ANY_TH': 0.0,
'PRIMER_RIGHT_0_SELF_END_TH': 0.0,
'PRIMER_RIGHT_0_SEQUENCE': 'AGATGGTGAAACCTGTTTGTTG',
'PRIMER_RIGHT_0_TM': 57.345909180244746,
'PRIMER_RIGHT_1': (142L, 22L),
'PRIMER_RIGHT_1_END_STABILITY': 3.33,
'PRIMER_RIGHT_1_GC_PERCENT': 40.90909090909091,
'PRIMER_RIGHT_1_HAIRPIN_TH': 39.90652312082375,
'PRIMER_RIGHT_1_PENALTY': 2.481136229632881,
'PRIMER_RIGHT_1_SELF_ANY_TH': 0.0,
'PRIMER_RIGHT_1_SELF_END_TH': 0.0,
'PRIMER_RIGHT_1_SEQUENCE': 'AGATGGTGAAACCTGTTTGTTG',
'PRIMER_RIGHT_1_TM': 57.345909180244746,
'PRIMER_RIGHT_2': (142L, 22L),
'PRIMER_RIGHT_2_END_STABILITY': 3.33,
'PRIMER_RIGHT_2_GC_PERCENT': 40.90909090909091,
'PRIMER_RIGHT_2_HAIRPIN_TH': 39.90652312082375,
'PRIMER_RIGHT_2_PENALTY': 2.481136229632881,
'PRIMER_RIGHT_2_SELF_ANY_TH': 0.0,
'PRIMER_RIGHT_2_SELF_END_TH': 0.0,
'PRIMER_RIGHT_2_SEQUENCE': 'AGATGGTGAAACCTGTTTGTTG',
'PRIMER_RIGHT_2_TM': 57.345909180244746,
'PRIMER_RIGHT_EXPLAIN': 'considered 255, overlap excluded region 66, GC content failed 43, low tm 56, high tm 2, long poly-x seq 41, ok 47',
'PRIMER_RIGHT_NUM_RETURNED': 3L,
'SEQUENCE_ID': 'R-chr1:114713809-114714010',
'SEQUENCE_TEMPLATE': 'TAATATCCGCAAATGACTTGCTATTATTGATGGCAAATACACAGAGGAAGCCTTCGCCTGTCCTCATGTATTGGTCTCTCATGGCACTGTACTCTTCTTGTCCAGCTGTATCCAGTATGTCCAACAAACAGGTTTCACCATCTATAACCACTTGTTTTCTGTAAGAATCCTGGGGGTGTggagggtaagggggcagggagg'}
None
{'PRIMER_INTERNAL_EXPLAIN': 'considered 3, ok 3',
'PRIMER_INTERNAL_NUM_RETURNED': 0L,
'PRIMER_LEFT_0': (51L, 23L),
'PRIMER_LEFT_0_END_STABILITY': 4.24,
'PRIMER_LEFT_0_GC_PERCENT': 43.47826086956522,
'PRIMER_LEFT_0_HAIRPIN_TH': 0.0,
'PRIMER_LEFT_0_PENALTY': 1.11245483566546,
'PRIMER_LEFT_0_SELF_ANY_TH': 0.0,
'PRIMER_LEFT_0_SELF_END_TH': 0.0,
'PRIMER_LEFT_0_SEQUENCE': 'ACAAAGTGGTTCTGGATTAGCTG',
'PRIMER_LEFT_0_TM': 58.92503010955636,
'PRIMER_LEFT_1': (54L, 22L),
'PRIMER_LEFT_1_END_STABILITY': 3.86,
'PRIMER_LEFT_1_GC_PERCENT': 45.45454545454545,
'PRIMER_LEFT_1_HAIRPIN_TH': 0.0,
'PRIMER_LEFT_1_PENALTY': 1.2669303718815925,
'PRIMER_LEFT_1_SELF_ANY_TH': 0.0,
'PRIMER_LEFT_1_SELF_END_TH': 0.0,
'PRIMER_LEFT_1_SEQUENCE': 'AAGTGGTTCTGGATTAGCTGGA',
'PRIMER_LEFT_1_TM': 59.087044490345306,
'PRIMER_LEFT_2': (55L, 22L),
'PRIMER_LEFT_2_END_STABILITY': 3.41,
'PRIMER_LEFT_2_GC_PERCENT': 45.45454545454545,
'PRIMER_LEFT_2_HAIRPIN_TH': 0.0,
'PRIMER_LEFT_2_PENALTY': 1.3141607548431367,
'PRIMER_LEFT_2_SELF_ANY_TH': 0.0,
'PRIMER_LEFT_2_SELF_END_TH': 0.0,
'PRIMER_LEFT_2_SEQUENCE': 'AGTGGTTCTGGATTAGCTGGAT',
'PRIMER_LEFT_2_TM': 58.88146858768033,
'PRIMER_LEFT_EXPLAIN': 'considered 507, overlap excluded region 66, GC content failed 60, low tm 128, high tm 78, ok 175',
'PRIMER_LEFT_NUM_RETURNED': 3L,
'PRIMER_PAIR_0_COMPL_ANY_TH': 0.0,
'PRIMER_PAIR_0_COMPL_END_TH': 0.0,
'PRIMER_PAIR_0_PENALTY': 1.9448563397245948,
'PRIMER_PAIR_0_PRODUCT_SIZE': 102L,
'PRIMER_PAIR_1_COMPL_ANY_TH': 0.0,
'PRIMER_PAIR_1_COMPL_END_TH': 0.0,
'PRIMER_PAIR_1_PENALTY': 2.0993318759407273,
'PRIMER_PAIR_1_PRODUCT_SIZE': 99L,
'PRIMER_PAIR_2_COMPL_ANY_TH': 0.0,
'PRIMER_PAIR_2_COMPL_END_TH': 0.0,
'PRIMER_PAIR_2_PENALTY': 2.1465622589022715,
'PRIMER_PAIR_2_PRODUCT_SIZE': 98L,
'PRIMER_PAIR_NUM_RETURNED': 3L,
'PRIMER_RIGHT_0': (152L, 22L),
'PRIMER_RIGHT_0_END_STABILITY': 3.41,
'PRIMER_RIGHT_0_GC_PERCENT': 40.90909090909091,
'PRIMER_RIGHT_0_HAIRPIN_TH': 0.0,
'PRIMER_RIGHT_0_PENALTY': 0.8324015040591348,
'PRIMER_RIGHT_0_SELF_ANY_TH': 0.0,
'PRIMER_RIGHT_0_SELF_END_TH': 0.0,
'PRIMER_RIGHT_0_SEQUENCE': 'TTCTTGCTGGTGTGAAATGACT',
'PRIMER_RIGHT_0_TM': 58.44506566396058,
'PRIMER_RIGHT_1': (152L, 22L),
'PRIMER_RIGHT_1_END_STABILITY': 3.41,
'PRIMER_RIGHT_1_GC_PERCENT': 40.90909090909091,
'PRIMER_RIGHT_1_HAIRPIN_TH': 0.0,
'PRIMER_RIGHT_1_PENALTY': 0.8324015040591348,
'PRIMER_RIGHT_1_SELF_ANY_TH': 0.0,
'PRIMER_RIGHT_1_SELF_END_TH': 0.0,
'PRIMER_RIGHT_1_SEQUENCE': 'TTCTTGCTGGTGTGAAATGACT',
'PRIMER_RIGHT_1_TM': 58.44506566396058,
'PRIMER_RIGHT_2': (152L, 22L),
'PRIMER_RIGHT_2_END_STABILITY': 3.41,
'PRIMER_RIGHT_2_GC_PERCENT': 40.90909090909091,
'PRIMER_RIGHT_2_HAIRPIN_TH': 0.0,
'PRIMER_RIGHT_2_PENALTY': 0.8324015040591348,
'PRIMER_RIGHT_2_SELF_ANY_TH': 0.0,
'PRIMER_RIGHT_2_SELF_END_TH': 0.0,
'PRIMER_RIGHT_2_SEQUENCE': 'TTCTTGCTGGTGTGAAATGACT',
'PRIMER_RIGHT_2_TM': 58.44506566396058,
'PRIMER_RIGHT_EXPLAIN': 'considered 507, overlap excluded region 66, low tm 60, high tm 131, ok 250',
'PRIMER_RIGHT_NUM_RETURNED': 3L,
'SEQUENCE_ID': 'R-chr1:114716023-114716228',
'SEQUENCE_TEMPLATE': 'GTCAGCGGGCTACCACTGGGCCTCACCTCTATGGTGGGATCATATTCATCTACAAAGTGGTTCTGGATTAGCTGGATTGTCAGTGCGCTTTTCCCAACACCACCTGCTCCAACCACCACCAGTTTGTACTCAGTCATTTCACACCAGCAAGAACCTGTTGGAAACCAGTAATCAGGGTTAATTGGCGAGCCACATCTACAGTACT'}
None
If I use this,
grep -E "PRIMER_LEFT_0_SEQUENCE|PRIMER_LEFT_1_SEQUENCE|PRIMER_RIGHT_0_SEQUENCE|PRIMER_RIGHT_1_SEQUENCE|SEQUENCE_ID" p3out_22_59_3pairs_del_No_out.out | paste - - - - - | awk '{ gsub("\047|,","",$0); print ">"$10"-L0\n"$2"\n>"$10"-L1\n"$4"\n>"$10"-R0\n"$6"\n>"$10"-R1\n"$8}' > xgrep_2primers.out
the output is:
>R-chr1:114713809-114714010-L0
ATGGCAAATACACAGAGGAAGC
>R-chr1:114713809-114714010-L1
GCAAATACACAGAGGAAGCCTT
>R-chr1:114713809-114714010-R0
AGATGGTGAAACCTGTTTGTTG
>R-chr1:114713809-114714010-R1
AGATGGTGAAACCTGTTTGTTG
Can anyone please explain the meaning of gsub("\047|,","",$0) in awk? awk is such a powerful tool and I'd like to understand it better. If you know good material or a good place to learn awk, please share it with me. Thanks in advance!
The \047 is an octal escape; it's actually just a single quote. When working with embedded quotes, it's sometimes easier to write \047 than something like '\''.
As for gsub, it runs the regex '|, (the same as [',]) against $0 and deletes every match (since the second argument is an empty string).
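To see the same deletion outside awk, here is a quick Python equivalent with re.sub (a minimal sketch; the sample line is taken from the source file above):

import re

line = "'PRIMER_LEFT_0_SEQUENCE': 'ATGGCAAATACACAGAGGAAGC',"
# The pattern '|, matches either a single quote or a comma; replacing
# matches with "" deletes them, just like gsub("\047|,","",$0) in awk
print(re.sub(r"'|,", "", line))
# PRIMER_LEFT_0_SEQUENCE: ATGGCAAATACACAGAGGAAGC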
Check out the POSIX awk documentation, which describes gsub as:
gsub(ere, repl[, in])
Behave like sub (see below), except that it shall replace all
occurrences of the regular expression (like the ed utility global
substitute) in $0 or in the in argument, when specified.
If you're using gawk, check out the GNU awk manual.
It's the same as tr -d "',": delete the characters single quote and comma.
It replaces every match of '|, (a single quote or a comma) with nothing. A bare apostrophe would end the surrounding shell string, so it is written as \047 instead.

d3.js force layout - filter and restoring nodes by attribute

I have been struggling with this issue for a long time with very little progress. I was going to put my code up here, but it is long and convoluted and I'm still not sure I'm even taking the correct approach, so I thought backing up, showing the data, and saying what I want to accomplish would be a better approach.
Basically my goal is to create a d3 force layout. All data will be 'hard coded' into the page. I have done some network analysis on the nodes and have included those metrics in the dataset (eigenvector, betweenness, etc.). I want to be able to create the network vis and have sliders that I can use to filter the network down by the various metrics. In other words, I have a range slider that is set to the min and max of the network's "degree" metric (as an example), and as I adjust the slider values, the network filters out the nodes that fall outside those values. I want to be able to filter these nodes out and back in.
I've seen a number of examples of filtering, and most are concerned with reducing the network; they don't speak to restoring the nodes. My attempts have resulted in either nothing happening, or multiple copies of existing nodes being created, or any number of behaviors, but not what I'm after. There are so many ways to 'skin the cat' in d3 that I keep going down paths that don't allow me (or that I'm just not able to understand how) to control the filtering the way I want.
I don't want to just control the visibility; I want the nodes to be removed and restored completely, and the network to readjust smoothly.
Here is a sample of the data that I am using:
var dataset = {
  "directed": false,
  "graph": [],
  "nodes": [
    {
      "category": "new",
      "eigen": 0.05923,
      "between": 0.0,
      "close": 0.25265,
      "deg": 1,
      "id": "Name1",
      "uid": 100006190145565
    },
    {
      "category": "known",
      "eigen": 0.00411,
      "between": 0.002002792177543483,
      "close": 0.19151,
      "deg": 3,
      "id": "Name2",
      "uid": 100002598631097
    },
    {
      "category": "new",
      "eigen": 0.0,
      "between": 0.0,
      "close": 0.06203,
      "deg": 1,
      "id": "Name3",
      "uid": 727631862
    },
    {
      "category": "new",
      "eigen": 0.00725,
      "between": 0.0,
      "close": 0.21037,
      "deg": 1,
      "id": "Name4",
      "uid": 100008585823128
    }
  ],
  "links": [
    {
      "source": 0,
      "target": 1
    },
    {
      "source": 0,
      "target": 1
    },
    {
      "source": 0,
      "target": 3
    }
  ],
  "multigraph": false
};
I can supply some of my code as well, but I think it would simply muddy the discussion: I have tried multiple approaches, none of which worked well, and each of which just seemed to confuse me further when I achieved partial results. Any help you can provide would be greatly appreciated.

How to glue months in cal-heatmap

I'm using cal-heatmap to draw a GitHub-like calendar, as you can see in this jsfiddle.
var calendar = new CalHeatMap();
calendar.init({
  data: data,
  start: new Date(2000, 1),
  domain: "month",
  subDomain: "day",
  range: 3,
  scale: [40, 60, 80, 100]
});
Is it possible to remove the space between each month (like the GitHub contribution graph)?
I have tried the option domainGutter: 0, but it does not work in this special case.
The short answer is no (not yet). Each month is in its own "domain" and can't be glued (overlapped, in your case), in order to allow smoother domain browsing.
