Finding IPs on a site? - proxy

Hi, I want to grab a proxy list from the net and search through it to find working proxy IPs and ports. My problem is: once I've fetched the page, how do I search through it to identify just the IPs and ports and disregard the rest? What I have so far doesn't work.
How do I identify just the proxy addresses and nothing else? Sorry, I'm a newb :) Any help would be appreciated.
package proxytester;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;

public class ProxyTester {
    public static void main(String[] args) {
        try {
            URL grab = new URL("http://www.example.com");
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(grab.openStream()));
            String input;
            while ((input = in.readLine()) != null) {
                // charAt(0) on an empty line throws StringIndexOutOfBoundsException,
                // which is the error mentioned above, so check for emptiness first.
                if (input.isEmpty() || input.charAt(0) == ' ') {
                    System.out.println("empty");
                } else if (input.charAt(0) == 'n') { // the site starts its proxy list with "name"
                    System.out.println(input);
                }
                // all other lines are skipped
            }
            in.close();
        } catch (MalformedURLException aa) {
            System.out.println("site error");
        } catch (IOException e) {
            System.out.println("io error");
        }
    } // end main
} // end class

I would suggest using regular expressions to find the IP address and the port. You need a regex that matches the IP address and the port number as captured groups.
This article explains how to use regular expressions in Java: http://www.mkyong.com/regular-expressions/how-to-validate-ip-address-with-regular-expression/
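For example, here is a minimal sketch using java.util.regex (the pattern and the sample line are illustrative assumptions, not taken from the actual site; the pattern is deliberately loose and accepts octets up to 999, so validate the ranges afterwards if you need strict checking):

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ProxyExtractor {
    // Captures a dotted-quad IP (group 1) and a port (group 2),
    // separated by a colon and/or whitespace.
    private static final Pattern IP_PORT =
            Pattern.compile("\\b(\\d{1,3}(?:\\.\\d{1,3}){3})[:\\s]+(\\d{1,5})\\b");

    public static void main(String[] args) {
        String line = "name proxy1 192.168.0.1:8080 anonymous"; // sample line
        Matcher m = IP_PORT.matcher(line);
        while (m.find()) {
            System.out.println(m.group(1) + " -> " + m.group(2)); // 192.168.0.1 -> 8080
        }
    }
}

You would run the matcher over each line read in your while loop instead of checking charAt(0).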


Send a ServerSentEvent from another Method

I'm trying to implement a Server-Sent Event controller for updating my web browser client with the newest data to display.
This is my current controller, which sends the list of my data every 5 seconds. I want to send an SSE every time I save my data in another service.
I read about using a channel, but how do I consume it with a Flux?
@GetMapping("/images-sse")
fun getImagesAsSSE(
    request: HttpServletRequest
): Flux<ServerSentEvent<MutableList<Image>>> {
    val subdomain = request.serverName.split(".").first()
    return Flux.interval(Duration.ofSeconds(5))
        .map {
            ServerSentEvent.builder<MutableList<Image>>()
                .event("periodic-event")
                .data(weddingService.getBySubdomain(subdomain)?.pictures)
                .build()
        }
}
Example code for controller:
package sk.qpp;

import lombok.extern.slf4j.Slf4j;
import org.springframework.http.HttpStatus;
import org.springframework.http.MediaType;
import org.springframework.http.codec.ServerSentEvent;
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.ResponseBody;
import org.springframework.web.bind.annotation.ResponseStatus;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Sinks;

import java.util.concurrent.atomic.AtomicLong;

@Controller
@Slf4j
public class ReactiveController {

    record SomeDTO(String name, String address) {
    }

    private final Sinks.Many<SomeDTO> eventSink = Sinks.many().multicast().directBestEffort();

    @RequestMapping(path = "/sse", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
    public Flux<ServerSentEvent<SomeDTO>> sse() {
        final AtomicLong counter = new AtomicLong(0);
        return eventSink.asFlux()
                .map(e -> ServerSentEvent.builder(e)
                        .id(counter.incrementAndGet() + "")
                        //.event(e.getClass().getName())
                        .build());
    }

    // Note: if you want this to work in production, make sure the HTTP request
    // is not cached along the way, for example by using the POST method.
    @ResponseStatus(HttpStatus.OK)
    @ResponseBody
    @GetMapping(path = "/sendSomething", produces = MediaType.TEXT_PLAIN_VALUE)
    public String sendSomething() {
        this.eventSink.emitNext(
                new SomeDTO("name", "address"),
                (signalType, emitResult) -> {
                    log.warn("Some event was not sent to all subscribers; it will vanish.");
                    // Return false so the emission is not retried.
                    return false;
                }
        );
        return "Have a look at the /sse endpoint (using \"curl http://localhost/sse\" for example) to see events in realtime.";
    }
}
The sink is used as a kind of "custom Flux": you can put anything into it (using emitNext) and take from it (using the asFlux() method).
After setting up the sample controller, open http://localhost:9091/sendSomething in your browser (i.e. do a GET request on it) and in a console issue the command curl http://localhost:9091/sse to see your SSE events (after each GET request, a new one should arrive). It is also possible to see SSE events directly in a Chromium browser. Firefox tries to download and save them to the filesystem as a file (that works too).
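As a minimal standalone sketch of that sink-to-flux pattern (the class and the values here are made up for illustration):

import reactor.core.publisher.Flux;
import reactor.core.publisher.Sinks;

public class SinkDemo {
    public static void main(String[] args) {
        // A multicast sink: anything pushed in via tryEmitNext(...) is
        // delivered to every subscriber of asFlux() at that moment.
        Sinks.Many<String> sink = Sinks.many().multicast().directBestEffort();

        Flux<String> flux = sink.asFlux();
        flux.subscribe(v -> System.out.println("subscriber saw: " + v));

        sink.tryEmitNext("hello"); // delivered to the subscriber above
        sink.tryEmitNext("world");
    }
}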
I finally got it working. I also added user-specific updates with a cookie.
Here is my SSE controller:
@RestController
@RequestMapping("/api/sse")
class SSEController {

    val imageUpdateSink: Sinks.Many<Wedding> = Sinks.many().multicast().directBestEffort()

    @GetMapping("/images")
    fun getImagesAsSSE(
        request: HttpServletRequest
    ): Flux<ServerSentEvent<MutableList<Image>>> {
        val counter = AtomicLong(0)
        return imageUpdateSink.asFlux()
            .filter { wedding ->
                val folderId = request.cookies.find { cookie ->
                    cookie.name == "folderId"
                }?.value
                folderId == wedding.folderId
            }.map { wedding ->
                ServerSentEvent.builder<MutableList<Image>>()
                    .event("new-image")
                    .data(wedding.pictures)
                    .id(counter.incrementAndGet().toString())
                    .build()
            }
    }
}
In my service, where my data is updated:
val updatedWedding = weddingRepository.findByFolderId(imageDTO.folderId)
sseController.imageUpdateSink.tryEmitNext(updatedWedding)
My Javascript looks like this:
document.cookie = "folderId=" + [[${wedding.folderId}]]
const evtSource = new EventSource("/api/sse/images")
evtSource.addEventListener("new-image", function(alpineContext){
return function (event) {
console.log(event.data)
alpineContext.images = JSON.parse(event.data)
};
}(this))

Error when trying to use XSSF on JMeter

I am getting an error when trying to create an xlsx file using JMeter. I already tried HSSF (for .xls) and it works fine, but when I change it to XSSF I get an error. I have already copied the jar files for poi and poi-ooxml into JMeter's lib folder. Here is my simple script:
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;

XSSFWorkbook workbook = new XSSFWorkbook();
XSSFSheet sheet = workbook.createSheet("Sample sheet");

Row row = sheet.createRow(0);
Cell cell = row.createCell(0);
cell.setCellValue("HENCIN");

try {
    FileOutputStream out = new FileOutputStream(new File("D:\\Jmeter\\testhencin.xlsx"));
    workbook.write(out);
    out.close();
    System.out.println("Excel written successfully..");
} catch (FileNotFoundException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}
When I tried to track down the error, I found the problem comes from these lines:
XSSFWorkbook workbook = new XSSFWorkbook();
XSSFSheet sheet = workbook.createSheet("Sample sheet");
Please help me figure it out. It works with HSSF but with XSSF it does not. I am getting the error: Response code: 500
Response message: org.apache.jorphan.util.JMeterException: Error invoking bsh method: eval org/apache/xmlbeans/XmlObject
I would suggest:
Catching all possible exceptions and printing the stacktrace to the jmeter.log file as well.
Re-throwing the exception to make sure you won't get a false-positive sampler result, something like:
} catch (Throwable e) {
    e.printStackTrace();
    log.info("Error in Beanshell", e);
    throw e;
}
With regards to your question, most likely it is due to a missing XMLBeans jar in the JMeter classpath. I would suggest the following:
Get a "clean" installation of the latest JMeter version
Download the latest version of tika-app.jar and drop it into JMeter's "lib" folder
Restart JMeter to pick the jar up
With Tika you get all the necessary libraries bundled; moreover, JMeter will display the content of binary files in the View Results Tree listener. See the How to Extract Data From Files With JMeter article for more details.

JMeter does not see the variables when trying to connect to the database

I have the following problem.
When I try to create a connection, JMeter cannot find the variables.
The test performs the following set of actions:
Load the local settings file (put into props).
Create a database connection. (This all happens in different thread groups; I also tried it in the same one.)
I use the following code to load the local property file (Beanshell, Thread Group 1):
FileInputStream is = new FileInputStream(new File("d:/somefolder/somefile.properties"));
props.load(is);
is.close();
Before creating the connection I checked the availability of the variables (Beanshell, Thread Group 2):
System.out.println(props.get("db.url"));
System.out.println(${__P("db.url")});
${__setProperty("db.url", props.get("db.url"))};
System.out.println(${__P("db.url")});
Output:
correct connection url
1 (because the __P function returns its default value if the variable is undefined; the default value here is 1)
correct connection url
I create the JDBC connection with the following parameter (Thread Group 2):
url: ${__P("db.url")}
The test fails because ${__P("db.url")} returns 1.
If I use ${__BeanShell(props.get(db.url))}, the test fails because props.get(db.url) returns nothing.
If I use ${__javaScript(props.get(db.url))}, the test fails because props.get(db.url) returns nothing.
The JDBC Connection Configuration component is initialized before the first Thread Group starts, so it doesn't see the variables, because they haven't been set yet.
Here is the script I created for connecting to the DB:
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.HashMap;

ResultSet clientConfigs = null;
ResultSet gameIds = null;
Connection connect = null;
Statement statement = null;
try {
    Class.forName(props.get("configdb.driverClassName"));
    String connectionUrl = props.get("configdb.url") + "?user=" + props.get("configdb.username") + "&password=" + props.get("configdb.password");
    connect = DriverManager.getConnection(connectionUrl);
    statement = connect.createStatement();

    clientConfigs = statement.executeQuery("Select keyname,valuestr from client_config WHERE valuestr Like '%/_ah/api/%'");
    ResultSetMetaData ccMetaData = clientConfigs.getMetaData();
    int clientConfigsColumns = ccMetaData.getColumnCount();
    ArrayList clientConfigsList = new ArrayList(20);
    while (clientConfigs.next()) {
        HashMap clientConfigsRow = new HashMap(clientConfigsColumns);
        for (int i = 1; i <= clientConfigsColumns; ++i) {
            clientConfigsRow.put(ccMetaData.getColumnName(i), clientConfigs.getObject(i));
        }
        clientConfigsList.add(clientConfigsRow);
    }
    vars.putObject("clientConfigs", clientConfigsList);

    gameIds = statement.executeQuery("Select gameId from game_config");
    ResultSetMetaData giMetaData = gameIds.getMetaData();
    int gameIdsColumns = giMetaData.getColumnCount();
    ArrayList gameIdsList = new ArrayList(50);
    while (gameIds.next()) {
        HashMap gameIdsRow = new HashMap(gameIdsColumns);
        for (int i = 1; i <= gameIdsColumns; ++i) {
            gameIdsRow.put(giMetaData.getColumnName(i), gameIds.getObject(i));
        }
        gameIdsList.add(gameIdsRow);
    }
    vars.putObject("gameIds", gameIdsList);
} catch (Exception e) {
    throw e;
} finally {
    try {
        if (clientConfigs != null) {
            clientConfigs.close();
        }
        if (gameIds != null) {
            gameIds.close();
        }
        if (statement != null) {
            statement.close();
        }
        if (connect != null) {
            connect.close();
        }
    } catch (Exception e) {
    }
}
Apparently JMeter loads JDBC configuration parameters very early, likely at UI load. See the discussion here:
The JDBC Config element is only processed at test startup, so it is
not possible to change the values once a test has started - i.e. you
could not change the values for different loops of the test plan. The
test plan has to know the JDBC settings near the start.
Which means: if you change the value of a property in Thread Group 1 at runtime, the change will not take effect in the JDBC config.
One possible way around this is to set the property from the command line while starting JMeter.
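For example (a sketch; db.url matches the property name used above, while the JDBC URL and test plan name are placeholders):
jmeter -Jdb.url=jdbc:mysql://dbhost:3306/mydb -n -t testplan.jmx
The JDBC Connection Configuration can then reference ${__P(db.url)}, and the value is already known when the element is initialized at test startup.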
Alternatively, drop the JDBC sampler entirely and use one of the custom samplers to interact with your JDBC data source.

Error trying to connect using Astyanax to Cassandra hosted on an EC2 instance

I am getting the error "astyanax.connectionpool.exceptions.PoolTimeoutException" when trying to use the Astyanax client to connect to Cassandra on an EC2 instance. I need help.
Following is my code snippet.
import com.netflix.astyanax.AstyanaxContext;
import com.netflix.astyanax.Keyspace;
import com.netflix.astyanax.connectionpool.NodeDiscoveryType;
import com.netflix.astyanax.connectionpool.OperationResult;
import com.netflix.astyanax.connectionpool.exceptions.ConnectionException;
import com.netflix.astyanax.connectionpool.impl.ConnectionPoolConfigurationImpl;
import com.netflix.astyanax.connectionpool.impl.ConnectionPoolType;
import com.netflix.astyanax.connectionpool.impl.CountingConnectionPoolMonitor;
import com.netflix.astyanax.impl.AstyanaxConfigurationImpl;
import com.netflix.astyanax.model.Column;
import com.netflix.astyanax.model.ColumnFamily;
import com.netflix.astyanax.model.ColumnList;
import com.netflix.astyanax.serializers.StringSerializer;
import com.netflix.astyanax.thrift.ThriftFamilyFactory;

public class MetadataRS {
    public static void main(String args[]) {
        AstyanaxContext<Keyspace> context = new AstyanaxContext.Builder()
                .forCluster("ClusterName")
                .forKeyspace("KeyspaceName")
                .withAstyanaxConfiguration(new AstyanaxConfigurationImpl()
                        .setDiscoveryType(NodeDiscoveryType.RING_DESCRIBE)
                        .setConnectionPoolType(ConnectionPoolType.ROUND_ROBIN)
                )
                .withConnectionPoolConfiguration(new ConnectionPoolConfigurationImpl("MyConnectionPool")
                        .setPort(9042)
                        .setMaxConnsPerHost(40)
                        .setSeeds("<EC2-IP>:9042")
                        .setConnectTimeout(5000)
                )
                .withConnectionPoolMonitor(new CountingConnectionPoolMonitor())
                .buildKeyspace(ThriftFamilyFactory.getInstance());
        context.start();

        Keyspace keyspace = context.getEntity();
        System.out.println(keyspace);

        ColumnFamily<String, String> CF_USER_INFO =
                new ColumnFamily<String, String>(
                        "Standard1",             // Column Family Name
                        StringSerializer.get(),  // Key Serializer
                        StringSerializer.get()); // Column Serializer

        OperationResult<ColumnList<String>> result = null;
        try {
            result = keyspace.prepareQuery(CF_USER_INFO)
                    .getKey("user_id_hash")
                    .execute();
        } catch (ConnectionException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
        ColumnList<String> columns = result.getResult();

        // Lookup columns in response by name
        String uid = columns.getColumnByName("user_id_hash").getStringValue();
        System.out.println(uid);

        // Or, iterate through the columns
        for (Column<String> c : result.getResult()) {
            System.out.println(c.getName());
        }
    }
}
Error
com.netflix.astyanax.thrift.ThriftKeyspaceImpl@1961f4
com.netflix.astyanax.connectionpool.exceptions.PoolTimeoutException: PoolTimeoutException: [host=():9042, latency=5001(5001), attempts=1] Timed out waiting for connection
at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.waitForConnection(SimpleHostConnectionPool.java:201)
at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.borrowConnection(SimpleHostConnectionPool.java:158)
at com.netflix.astyanax.connectionpool.impl.RoundRobinExecuteWithFailover.borrowConnection(RoundRobinExecuteWithFailover.java:60)
at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:50)
at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:229)
at com.netflix.astyanax.thrift.ThriftColumnFamilyQueryImpl$1.execute(ThriftColumnFamilyQueryImpl.java:180)
at com.rjil.jiodrive.rs.MetadataRS.main(MetadataRS.java:57)
Exception in thread "main" java.lang.NullPointerException
at com.rjil.jiodrive.rs.MetadataRS.main(MetadataRS.java:62)
Since you are running Cassandra on an EC2 instance, check that Cassandra's port number (which you have chosen as 9042) is in the allowed list of the EC2 security group and that you have access to it. If not, add the port number to the inbound list of the EC2 security group and set the IP range to 0.0.0.0/0.
Also check that the firewall on the EC2 instance is turned off. It is off by default, but it's good to check anyway.
If you have done this, then your client might be behind a firewall that prevents outbound traffic to your chosen port (9042).
Lastly, if you have not used an Elastic IP, it is better to use the EC2 instance DNS name, both in your setSeeds section and in the rpc_address of cassandra.yaml.
Your issue is not in your code. You have a connectivity problem to the node you specified as your seed. So either that node isn't running, or you can't reach it from the machine running your client.
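As a quick sanity check (a plain-JDK sketch, independent of Astyanax; the host placeholder mirrors the one in the question), you can verify that the seed is reachable at all before digging into the client configuration:

import java.net.InetSocketAddress;
import java.net.Socket;

public class SeedCheck {
    public static void main(String[] args) throws Exception {
        String host = "<EC2-IP>"; // same placeholder as in the question
        int port = 9042;
        try (Socket s = new Socket()) {
            // Fails fast if the security group or a firewall blocks the port.
            s.connect(new InetSocketAddress(host, port), 5000);
            System.out.println("Seed is reachable");
        } catch (Exception e) {
            System.out.println("Cannot reach seed: " + e.getMessage());
        }
    }
}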
I finally upgraded libthrift to 0.9 and changed my code to the following, and it is working fine now.
public Keyspace getDBConnection() {
    if (poolConfig == null) {
        poolConfig = new ConnectionPoolConfigurationImpl("CassandraPool")
                .setPort(port)
                .setMaxConnsPerHost(1)
                .setSeeds(new StringBuilder(seedHost).append(":").append(port).toString())
                // Will re-sort hosts per token partition every 10 seconds.
                .setLatencyAwareUpdateInterval(latencyAwareUpdateInterval)
                // Will clear the latency every 10 seconds. In practice I set this
                // to 0, which is the default; it's better to be 0.
                .setLatencyAwareResetInterval(latencyAwareResetInterval)
                // Will sort hosts if a host is more than 100% slower than the best
                // and always assign connections to the fastest host; otherwise
                // round robin is used.
                .setLatencyAwareBadnessThreshold(latencyAwareBadnessThreshold)
                // Uses the last 100 latency samples. These samples are in a FIFO
                // queue and just cycle themselves.
                .setLatencyAwareWindowSize(latencyAwareWindowSize)
                .setTimeoutWindow(60000);
    }
    AstyanaxContext<Keyspace> context = new AstyanaxContext.Builder()
            .forCluster(clusterName)
            .forKeyspace(keyspaceName)
            .withAstyanaxConfiguration(
                    new AstyanaxConfigurationImpl()
                            .setDiscoveryType(NodeDiscoveryType.NONE)
                            .setConnectionPoolType(ConnectionPoolType.ROUND_ROBIN)
                            .setCqlVersion("3.0.0")
                            .setTargetCassandraVersion("2.0"))
            .withConnectionPoolConfiguration(poolConfig)
            .withConnectionPoolMonitor(new CountingConnectionPoolMonitor())
            .buildKeyspace(ThriftFamilyFactory.getInstance());
    context.start();
    return context.getClient();
}

Interpreting output from mahout clusterdumper

I ran a clustering test on crawled pages (more than 25K docs; personal data set).
I've done a clusterdump:
$MAHOUT_HOME/bin/mahout clusterdump --seqFileDir output/clusters-1/ --output clusteranalyze.txt
The output after running the cluster dumper shows 25 elements of the form "VL-xxxxx {...}":
VL-24130{n=1312 c=[0:0.017, 10:0.007, 11:0.005, 14:0.017, 31:0.016, 35:0.006, 41:0.010, 43:0.008, 52:0.005, 59:0.010, 68:0.037, 72:0.056, 87:0.028, ... ] r=[0:0.442, 10:0.271, 11:0.198, 14:0.369, 31:0.421, ... ]}
...
VL-24868{n=311 c=[0:0.042, 11:0.016, 17:0.046, 72:0.014, 96:0.044, 118:0.015, 135:0.016, 195:0.017, 318:0.040, 319:0.037, 320:0.036, 330:0.030, ...] r=[0:0.740, 11:0.287, 17:0.576, 72:0.239, 96:0.549, 118:0.273, ...]}
How do I interpret this output?
In short: I am looking for the document ids that belong to a particular cluster.
What is the meaning of:
VL-x?
n=y c=[z:z', ...]
r=[z'':z''', ...]
Does 0:0.017 mean "0" is the document id which belongs to this cluster?
I have already read on the Mahout wiki pages what CL, n, c and r mean. But can someone please explain them to me better, or point me to a resource where this is explained in a bit more detail?
Sorry if I am asking some stupid questions, but I am a newbie with Apache Mahout, using it as part of my course assignment on clustering.
By default, k-means clustering uses WeightedVector, which does not include the data point name. So you want to make the sequence file yourself using NamedVector. There is a one-to-one correspondence between the number of seq files and the mapping tasks, so if your mapping capacity is 12, you want to chop your data into 12 pieces when making the seqfiles.
NamedVector:
vector = new NamedVector(new SequentialAccessSparseVector(Cardinality), arrField[0]);
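As a minimal sketch of building such a sequence file (assuming the same Mahout/Hadoop API generation as the rest of this answer; the path, cardinality, weights and document id are illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.mahout.math.NamedVector;
import org.apache.mahout.math.SequentialAccessSparseVector;
import org.apache.mahout.math.Vector;
import org.apache.mahout.math.VectorWritable;

public class NamedVectorWriter {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path("vectors/part-00000"); // illustrative path

        SequenceFile.Writer writer = new SequenceFile.Writer(fs, conf, path,
                Text.class, VectorWritable.class);
        try {
            Vector v = new SequentialAccessSparseVector(10000); // cardinality
            v.set(0, 0.017);
            // Wrapping the vector with its document id is what later lets you
            // recover which document landed in which cluster.
            NamedVector named = new NamedVector(v, "doc-42");
            writer.append(new Text(named.getName()), new VectorWritable(named));
        } finally {
            writer.close();
        }
    }
}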
Basically you need to download the clusteredPoints from your HDFS system and write your own code to output the results. Here is the code that I wrote to output the cluster point membership.
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.HashMap;
import java.util.Set;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.mahout.clustering.WeightedVectorWritable;
import org.apache.mahout.math.NamedVector;

public class ClusterOutput {
    /**
     * @param args args[0]: local folder with the clusteredPoints part files,
     *             args[1]: output file for point-to-cluster membership,
     *             args[2]: output file for cluster sizes
     */
    public static void main(String[] args) {
        try {
            BufferedWriter bw;
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            File pointsFolder = new File(args[0]);
            File files[] = pointsFolder.listFiles();
            bw = new BufferedWriter(new FileWriter(new File(args[1])));
            HashMap<String, Integer> clusterIds = new HashMap<String, Integer>(5000);
            for (File file : files) {
                if (file.getName().indexOf("part-m") < 0)
                    continue;
                SequenceFile.Reader reader = new SequenceFile.Reader(fs, new Path(file.getAbsolutePath()), conf);
                IntWritable key = new IntWritable();
                WeightedVectorWritable value = new WeightedVectorWritable();
                while (reader.next(key, value)) {
                    NamedVector vector = (NamedVector) value.getVector();
                    String vectorName = vector.getName();
                    bw.write(vectorName + "\t" + key.toString() + "\n");
                    if (clusterIds.containsKey(key.toString())) {
                        clusterIds.put(key.toString(), clusterIds.get(key.toString()) + 1);
                    } else {
                        clusterIds.put(key.toString(), 1);
                    }
                }
                bw.flush();
                reader.close();
            }
            bw.flush();
            bw.close();
            bw = new BufferedWriter(new FileWriter(new File(args[2])));
            Set<String> keys = clusterIds.keySet();
            for (String key : keys) {
                bw.write(key + " " + clusterIds.get(key) + "\n");
            }
            bw.flush();
            bw.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
To complete the answer:
VL-x: the identifier of the cluster
n=y: the number of elements in the cluster
c=[z, ...]: the centroid of the cluster, with the z's being the weights of the different dimensions
r=[z, ...]: the radius of the cluster
More info here:
https://mahout.apache.org/users/clustering/cluster-dumper.html
I think you need to read the source code -- download from http://mahout.apache.org. VL-24130 is just a cluster identifier for a converged cluster.
You can use mahout clusterdump
https://cwiki.apache.org/MAHOUT/cluster-dumper.html
