Pig Not Interpreting Int Correctly -- Custom Loader - hadoop

So this is my first time ever using Pig and I'm having a hard time getting it to interpret my data correctly. I don't want to have to define a schema for my input files until run time, so I wrote a super simple custom loader. The only change I made to PigStorage was overriding the getSchema method to read the first two lines of my file and build a schema from them:
public ResourceSchema getSchema(String location, Job job) throws IOException {
    BufferedReader br = new BufferedReader(new FileReader(location.replace("file://", "")));
    String[] line = br.readLine().split(",");
    String[] data = br.readLine().split(",");
    List<FieldSchema> fields = new ArrayList<FieldSchema>();
    for (int f = 0; f < line.length; f++) {
        Byte type = GetType(data[f].replace("\"", ""));
        fields.add(new FieldSchema(line[f].replace("\"", ""), type));
    }
    schema = new ResourceSchema(new Schema(fields));
    return schema;
}
private Byte GetType(Object Data) {
    try {
        int number = Integer.parseInt(Data.toString());
        return org.apache.pig.data.DataType.INTEGER;
    } catch (Exception e) { }
    try {
        double dnumber = Double.parseDouble(Data.toString());
        return org.apache.pig.data.DataType.DOUBLE;
    } catch (Exception e) { }
    return org.apache.pig.data.DataType.CHARARRAY;
}
When I load a file and run DESCRIBE on it, it looks like what I want, for instance:
{CU_NUMBER: int,CYCLE_DATE: chararray,JOIN_NUMBER: int,RSSD: int,CU_TYPE: int,CU_NAME: chararray}
And the first 10 rows look like this:
(1,9/30/2013 0:00:00,2,"50377","1","MORRIS SHEPPARD TEXARKANA")
(5,9/30/2013 0:00:00,6,"859879","1","FIRST CASTLE")
(6,9/30/2013 0:00:00,7,"54571","1","THE NEW ORLEANS FIREMEN'S")
(12,9/30/2013 0:00:00,11,"56678","1","FRANKLIN TRUST")
(13,9/30/2013 0:00:00,12,"861676","1","E")
(16,9/30/2013 0:00:00,14,"59277","1","WOODMEN")
(19,9/30/2013 0:00:00,16,"863773","1","NEW HAVEN TEACHERS")
(22,9/30/2013 0:00:00,17,"61074","1","WATERBURY CONNECTICUT TEACHER")
(26,9/30/2013 0:00:00,19,"866372","1","FARMERS")
(28,9/30/2013 0:00:00,21,"953375","1","CENTRIS")
However, when I try to do stuff with the data like:
FOICU = LOAD 'file:///home/biadmin/NCUA/foicu.txt' USING org.apache.pig.builtin.PigStorageInferSchema(',', '-schema');
FirstSixColumns = FOREACH FOICU GENERATE CU_NUMBER, CYCLE_DATE, JOIN_NUMBER, RSSD, CU_TYPE, CU_NAME;
TopTen = LIMIT FirstSixColumns 10;
FOICUFiltered = FILTER TopTen BY CU_NUMBER > 20;
CU_FIVE = FILTER TopTen BY CU_NUMBER == 5;
DUMP FOICUFiltered;
DUMP CU_FIVE;
FOICUFiltered returns all 10 rows even though 7 of them have a CU_NUMBER less than 20:
(1,9/30/2013 0:00:00,2,"50377","1","MORRIS SHEPPARD TEXARKANA")
(5,9/30/2013 0:00:00,6,"859879","1","FIRST CASTLE")
(6,9/30/2013 0:00:00,7,"54571","1","THE NEW ORLEANS FIREMEN'S")
(12,9/30/2013 0:00:00,11,"56678","1","FRANKLIN TRUST")
(13,9/30/2013 0:00:00,12,"861676","1","E")
(16,9/30/2013 0:00:00,14,"59277","1","WOODMEN")
(19,9/30/2013 0:00:00,16,"863773","1","NEW HAVEN TEACHERS")
(22,9/30/2013 0:00:00,17,"61074","1","WATERBURY CONNECTICUT TEACHER")
(26,9/30/2013 0:00:00,19,"866372","1","FARMERS")
(28,9/30/2013 0:00:00,21,"953375","1","CENTRIS")
And CU_FIVE returns no rows at all.
Does anybody know what I've done wrong here and is there a better way to dynamically load the schema at run time without using schema files?

Related

Hadoop Map Reduce: How to create a reduce function for this?

I've hit a brick wall. I have the following files, which I've generated from previous MR jobs.
Product Scores (I have)
0528881469 1.62
0594451647 2.28
0594481813 2.67
0972683275 4.37
1400501466 3.62
where column 1 = product_id, and column 2 = product_rating
Related Products (I have)
0000013714 [0005080789,0005476798,0005476216,0005064341]
0000031852 [B00JHONN1S,B002BZX8Z6,B00D2K1M3O,0000031909]
0000031887 [0000031852,0000031895,0000031909,B00D2K1M3O]
0000031895 [B002BZX8Z6,B00JHONN1S,0000031909,B008F0SU0Y]
0000031909 [B002BZX8Z6,B00JHONN1S,0000031895,B00D2K1M3O]
where column 1 = product_id, and column 2 = array of also_bought products
The file I am trying to create now combines both of these files into the following:
Recommended Products (I need)
0000013714 [<0005080789, 2.34>,<0005476798, 4.58>,<0005476216, 2.32>]
0000031852 [<0005476798, 4.58>,<0005080789, 2.34>,<0005476216, 2.32>]
0000031887 [<0005080789, 2.34>,<0005476798, 4.58>,<0005476216, 2.32>]
0000031895 [<0005476216, 2.32>,<0005476798, 4.58>,<0005080789, 2.34>]
0000031909 [<0005476216, 2.32>,<0005080789, 2.34>,<0005476798, 4.58>]
where column 1 = product_id and column 2 = array of <also_bought product_id, product_rating> tuples
I'm totally stuck at the moment. I thought I had a plan for this, but it turned out not to be a very good plan and it didn't work.
Two approaches, depending on the size of your Product Scores data:
If your Product Scores file is not huge, you can load it into the Hadoop distributed cache (now available on the Job itself via Job.addCacheFile()). Then process the Related Products file, fetch the necessary rating in the reducer, and write it out. Quick and dirty; but if Product Scores is a huge file, this is probably not the right way to go about the problem. A minimal sketch of this approach follows the second option below.
Reduce-side joins. Various examples are available; for example, refer to this link to get an idea.
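For the distributed-cache option, here is a minimal sketch of the reducer-side lookup, assuming a mapper (not shown) that emits each also_bought id under its product_id key. The class name, the HDFS path in the driver call, and the whitespace-delimited layout of the scores file are illustrative assumptions, not your actual job:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Driver side (not shown in full): job.addCacheFile(new URI("/data/product_scores.txt#scores"));
// The "#scores" fragment controls the local name the file gets on each task node.
public class RecommendedProductsReducer extends Reducer<Text, Text, Text, Text> {

    private final Map<String, String> scores = new HashMap<>();

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        // Parse the cached Product Scores file once per task: product_id <whitespace> rating
        try (BufferedReader br = new BufferedReader(new FileReader("scores"))) {
            String line;
            while ((line = br.readLine()) != null) {
                String[] parts = line.trim().split("\\s+");
                scores.put(parts[0], parts[1]);
            }
        }
    }

    @Override
    protected void reduce(Text productId, Iterable<Text> alsoBought, Context context)
            throws IOException, InterruptedException {
        // Build "[<related_id, rating>,...]" for all related products of this product_id
        StringBuilder out = new StringBuilder("[");
        for (Text related : alsoBought) {
            if (out.length() > 1) {
                out.append(",");
            }
            String rating = scores.getOrDefault(related.toString(), "0.0");
            out.append("<").append(related).append(", ").append(rating).append(">");
        }
        out.append("]");
        context.write(productId, new Text(out.toString()));
    }
}
If Product Scores is too big to cache comfortably, the second option (a reduce-side join keyed on product_id) is the safer route.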
As you have already defined a schema, you can also create Hive tables on top of it and get the output using queries. This would save you a lot of time.
Edit: Moreover, if you already have map-reduce jobs that create these files, you can add Hive jobs that create external Hive tables on top of the reducer outputs and then query them.
I ended up using a MapFile. I transformed both the ProductScores and RelatedProducts data sets into two MapFiles and then made a Java program that pulled information out of these MapFiles when needed.
MapFileWriter
public class MapFileWriter {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        Path inputFile = new Path(args[0]);
        Path outputFile = new Path(args[1]);
        Text txtKey = new Text();
        Text txtValue = new Text();
        try {
            FileSystem fs = FileSystem.get(conf);
            FSDataInputStream inputStream = fs.open(inputFile);
            Writer writer = new Writer(conf, fs, outputFile.toString(), txtKey.getClass(), txtValue.getClass());
            writer.setIndexInterval(1);
            while (inputStream.available() > 0) {
                String strLineInInputFile = inputStream.readLine();
                String[] lstKeyValuePair = strLineInInputFile.split("\\t");
                txtKey.set(lstKeyValuePair[0]);
                txtValue.set(lstKeyValuePair[1]);
                writer.append(txtKey, txtValue);
            }
            writer.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
MapFileReader
public class MapFileReader {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        FileSystem fs;
        Text txtKey = new Text(args[1]);
        Text txtValue = new Text();
        MapFile.Reader reader;
        try {
            fs = FileSystem.get(conf);
            try {
                reader = new MapFile.Reader(fs, args[0], conf);
                reader.get(txtKey, txtValue);
            } catch (Exception e) {
                e.printStackTrace();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        System.out.println("The value for Key " + txtKey.toString() + " is " + txtValue.toString());
    }
}

How to repeat Job with Partitioner when data is dynamic with Spring Batch?

I am trying to develop a batch process using Spring Batch + Spring Boot (Java config), but I have a problem doing so. I have a piece of software that has a database and a Java API, and I read records from there. The batch process should retrieve all the documents whose expiration date is earlier than a certain date, update the date, and save them again in the same database.
My first approach was reading the records 100 at a time: the ItemReader retrieves 100 records, I process them one by one, and finally I write them again. In the reader, I put this code:
public class DocumentItemReader implements ItemReader<Document> {

    public List<Document> documents = new ArrayList<>();

    @Override
    public Document read() throws Exception, UnexpectedInputException, ParseException, NonTransientResourceException {
        if (documents.isEmpty()) {
            getDocuments(); // This method retrieves 100 documents and stores them in the "documents" list.
            if (documents.isEmpty()) return null;
        }
        Document doc = documents.get(0);
        documents.remove(0);
        return doc;
    }
}
So, with this code, the reader reads from the database until no records are found. When the getDocuments() method doesn't retrieve any documents, the list is empty and the reader returns null, so the job finishes. Everything worked fine here.
However, the problem appears when I want to use several threads. In this case, I started using the Partitioner approach instead of multi-threading. The reason is that I read from the same database, so if I simply repeat the full step with several threads, all of them will find the same records, and I cannot use pagination (see below).
Another problem is that the database records are updated dynamically, so I cannot use pagination. For example, suppose I have 200 records, all of them about to expire, so the process is going to retrieve them. Now imagine I retrieve 10 with one thread and, before anything else, that thread processes one and updates it in the same database. The next thread cannot retrieve records 11 to 20, because the first record no longer appears in the search (it has been processed, its date has been updated, and it no longer matches the query).
It is a little difficult to understand, and some things may sound strange, but in my project:
I am forced to use the same database to read and write.
I can have millions of documents, so I cannot read all the records at the same time. I need to read them 100 by 100, or 500 by 500.
I need to use several threads.
I cannot use pagination, as the query to the database will retrieve different documents each time it is executed.
So, after hours of thinking, I believe the only possible solution is to repeat the job until the query retrieves no documents. Is this possible? I want to do something like what the step already does (do something until null is returned): repeat the job until the query returns zero records.
If this is not a good approach, I would appreciate other possible solutions.
Thank you.
Maybe you can add a partitioner to your step that will:
Select all the ids of the data that needs to be updated (and other columns if needed)
Split them into x partitions (x = the gridSize parameter) and write them to temporary files (one per partition).
Register the file name to read in the executionContext
Then your reader is no longer reading from the database but from the partitioned file (a sketch of such a reader follows the partitioner code below).
It may seem complicated, but it's not that bad. Here is an example that handles millions of records using a JDBC query; it can easily be transposed to your use case:
public class JdbcToFilePartitioner implements Partitioner {

    /** number of records per database fetch */
    private int fetchSize = 100;

    /** working directory */
    private File tmpDir;

    /** limit on the number of items to select */
    private Long nbItemMax;

    /** extra parameters copied into each partition's context (referenced below; declaration added for completeness) */
    private Map<String, Object> contextParameters;

    /** Hibernate session used by runQuery() (referenced below; declaration added for completeness) */
    private Session session;

    @Override
    public Map<String, ExecutionContext> partition(final int gridSize) {
        // Create a context for each partition
        Map<String, ExecutionContext> executionsContexte = createExecutionsContext(gridSize);
        // Fill the partition files with the ids to handle
        getIdsAndFillPartitionFiles(executionsContexte);
        return executionsContexte;
    }

    /**
     * @param gridSize number of partitions
     * @return map of execution contexts, one for each partition
     */
    private Map<String, ExecutionContext> createExecutionsContext(final int gridSize) {
        final Map<String, ExecutionContext> map = new HashMap<>();
        for (int partitionId = 0; partitionId < gridSize; partitionId++) {
            map.put(String.valueOf(partitionId), createContext(partitionId));
        }
        return map;
    }

    /**
     * @param partitionId id of the partition to create a context for
     * @return created executionContext
     */
    private ExecutionContext createContext(final int partitionId) {
        final ExecutionContext context = new ExecutionContext();
        String fileName = tmpDir + File.separator + "partition_" + partitionId + ".txt";
        context.put(PartitionerConstantes.ID_GRID.getCode(), partitionId);
        context.put(PartitionerConstantes.FILE_NAME.getCode(), fileName);
        if (contextParameters != null) {
            for (Entry<String, Object> entry : contextParameters.entrySet()) {
                context.put(entry.getKey(), entry.getValue());
            }
        }
        return context;
    }

    private void getIdsAndFillPartitionFiles(final Map<String, ExecutionContext> executionsContexte) {
        List<BufferedWriter> fileWriters = new ArrayList<>();
        try {
            // One BufferedWriter per partition
            for (int i = 0; i < executionsContexte.size(); i++) {
                BufferedWriter bufferedWriter = new BufferedWriter(new FileWriter(
                        executionsContexte.get(String.valueOf(i)).getString(PartitionerConstantes.FILE_NAME.getCode())));
                fileWriters.add(bufferedWriter);
            }
            // Fetch the data
            ScrollableResults results = runQuery();
            // Round-robin the results over the partition files
            int currentPartition = 0;
            int nbWriting = 0;
            while (results.next()) {
                fileWriters.get(currentPartition).write(results.get(0).toString());
                fileWriters.get(currentPartition).newLine();
                currentPartition++;
                nbWriting++;
                // Once we have written to every partition, start again at the first one
                if (currentPartition >= executionsContexte.size()) {
                    currentPartition = 0;
                }
                // Stop if we have reached the maximum number of items to read
                if (nbItemMax != null && nbItemMax != 0 && nbWriting >= nbItemMax) {
                    break;
                }
            }
            // Closing
            results.close();
            session.close();
            for (BufferedWriter bufferedWriter : fileWriters) {
                bufferedWriter.close();
            }
        } catch (IOException | SQLException e) {
            throw new UnexpectedJobExecutionException("Error writing partition file", e);
        }
    }

    private ScrollableResults runQuery() {
        ...
    }
}
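To complete the picture, the reader for each partition can then be a step-scoped FlatFileItemReader bound to the file name the partitioner stored in the execution context. This is a rough sketch, not part of the original answer: the SpEL key must match whatever PartitionerConstantes.FILE_NAME.getCode() returns (assumed here to be "fileName"), and reading each line back as a Long id is an assumption about the partition file contents.
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.builder.FlatFileItemReaderBuilder;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.FileSystemResource;

@Configuration
public class PartitionReaderConfig {

    @Bean
    @StepScope
    public FlatFileItemReader<Long> partitionFileReader(
            @Value("#{stepExecutionContext['fileName']}") String fileName) {
        // Reads one id per line from the temporary file written by the partitioner
        return new FlatFileItemReaderBuilder<Long>()
                .name("partitionFileReader")
                .resource(new FileSystemResource(fileName))
                .lineMapper((line, lineNumber) -> Long.valueOf(line))
                .build();
    }
}
An ItemProcessor can then load each Document by id through your existing API and update its date, so each thread only ever touches the ids listed in its own partition file and the "moving query" problem goes away.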

Loading Files in UDF

I have a requirement to populate a field based on the evaluation of a UDF. The inputs to the UDF are some other fields from the input data as well as a CSV file. Presently, the approach I have taken is to load the CSV file, GROUP it ALL, and then pass it as a bag to the UDF along with the other required parameters. However, it is taking a very long time to complete the process (roughly about 3 hours) for source data of about 170k records and a CSV of about 150k records.
I'm sure there must be a much more efficient way to handle this, hence I need your input.
source_alias = LOAD 'src.csv' USING PigStorage(',') AS (f1:chararray, f2:chararray, f3:chararray);
csv_alias = LOAD 'csv_file.csv' USING PigStorage(',') AS (c1:chararray, c2:chararray, c3:chararray);
grpd_csv_alias = GROUP csv_alias ALL;
final_alias = FOREACH source_alias GENERATE f1 AS f1, myUDF(grpd_csv_alias, f2) AS derived_f2;
Here is my UDF at a high level:
public class myUDF extends EvalFunc<String> {
    public String exec(Tuple input) throws IOException {
        String f2Response = "N";
        DataBag csvAliasBag = (DataBag) input.get(0);
        String f2 = (String) input.get(1);
        try {
            Iterator<Tuple> bagIterator = csvAliasBag.iterator();
            while (bagIterator.hasNext()) {
                Tuple localTuple = (Tuple) bagIterator.next();
                String col1 = ((String) localTuple.get(1)).trim().toLowerCase();
                String col2 = ((String) localTuple.get(2)).trim().toLowerCase();
                String col3 = ((String) localTuple.get(3)).trim().toLowerCase();
                String col4 = ((String) localTuple.get(4)).trim().toLowerCase();
                <Custom logic to populate f2Response based on the value in f2 as well as col1, col2, col3 and col4>
            }
            return f2Response;
        } catch (Exception e) {
            throw new IOException("Caught exception processing input row ", e);
        }
    }
}
I believe the process is taking too long because the csv_alias bag is built and passed to the UDF for every row in the source file.
Is there any better way to handle this?
Thanks
For small files, you can put them in the distributed cache. This copies the file to each task node as a local file, and then you load it yourself. Here's an example from the UDF section of the Pig docs. I would not recommend parsing the file on every call, however: store your results in a class variable and check whether it has already been initialized (a sketch of that caching pattern follows the example below). If the CSV is on the local file system, use getShipFiles. If the CSV you're using is on HDFS, use the getCacheFiles method. Notice that for HDFS there is a file path followed by a # and some text; to the left of the # is the HDFS path, and to the right is the name you want the file to have when it is copied to the local file system.
public class Udfcachetest extends EvalFunc<String> {

    public String exec(Tuple input) throws IOException {
        String concatResult = "";
        FileReader fr = new FileReader("./smallfile1");
        BufferedReader d = new BufferedReader(fr);
        concatResult += d.readLine();
        fr = new FileReader("./smallfile2");
        d = new BufferedReader(fr);
        concatResult += d.readLine();
        return concatResult;
    }

    public List<String> getCacheFiles() {
        List<String> list = new ArrayList<String>(1);
        list.add("/user/pig/tests/data/small#smallfile1"); // This is an HDFS file
        return list;
    }

    public List<String> getShipFiles() {
        List<String> list = new ArrayList<String>(1);
        list.add("/home/hadoop/pig/smallfile2"); // This is a local file
        return list;
    }
}
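Building on that, a rough sketch of the "parse once, reuse" pattern applied to your lookup CSV could look like the following. The class name, the HDFS path, the local alias lookup.csv, and the idea of passing only f2 (rather than the whole grouped bag) are illustrative assumptions, not your actual code:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

public class MyLookupUDF extends EvalFunc<String> {

    // Parsed once per task and reused across exec() calls
    private List<String[]> csvRows;

    private void initIfNeeded() throws IOException {
        if (csvRows != null) {
            return;
        }
        csvRows = new ArrayList<String[]>();
        // "lookup.csv" is the local alias given after the '#' in getCacheFiles()
        BufferedReader br = new BufferedReader(new FileReader("./lookup.csv"));
        try {
            String line;
            while ((line = br.readLine()) != null) {
                csvRows.add(line.split(","));
            }
        } finally {
            br.close();
        }
    }

    @Override
    public String exec(Tuple input) throws IOException {
        initIfNeeded();
        String f2 = (String) input.get(0); // only the field is passed now, not the whole bag
        String f2Response = "N";
        for (String[] row : csvRows) {
            // <custom logic comparing f2 against the row's columns, setting f2Response>
        }
        return f2Response;
    }

    @Override
    public List<String> getCacheFiles() {
        // HDFS path on the left of '#', local alias on the right (hypothetical path)
        return Arrays.asList("/user/hadoop/csv_file.csv#lookup.csv");
    }
}
With this in place, the GROUP csv_alias ALL step and the bag argument disappear from the Pig script; the FOREACH would call something like myLookupUDF(f2), and each task reads the CSV a single time instead of receiving the whole bag on every call.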

NHib 3 Configuration & Mapping returning empty results?

Note: I'm specifically not using Fluent NHibernate but NHibernate 3.x's built-in mapping-by-code style. However, I am getting an empty result set when I think I should be getting records returned.
I'm sure I'm doing something wrong and it's driving me up a wall. :)
Background / Setup
I have an Oracle 11g database for a product by IBM called Maximo
This product has a table called workorder which lists workorders; that table has a field called "wonum" which represents a unique work order number.
I have a "reporting" user which can access the table via the maximo schema
e.g. "select * from maximo.workorder"
I am using Oracle's Managed ODP.NET DLL to accomplish data tasks, and using it for the first time.
Things I've Tried
I created a basic console application to test this
I added the OracleManagedClientDriver.cs from the NHibernate.Driver on the master branch (it is not officially in the release I'm using).
I created a POCO called WorkorderBriefBrief, which only has a WorkorderNumber field.
I created a class map, WorkorderBriefBriefMap, which maps only that value as a read-only value.
I created a console application with console output to attempt to write the lines of work orders.
The session and transaction appear to open correctly.
I tested a standard ODP.NET OracleConnection against my connection string
The Code
POCO: WorkorderBriefBrief.cs
namespace PEApps.Model.WorkorderQuery
{
    public class WorkorderBriefBrief
    {
        public virtual string WorkorderNumber { get; set; }
    }
}
Mapping: WorkorderBriefBriefMap.cs
using NHibernate.Mapping.ByCode;
using NHibernate.Mapping.ByCode.Conformist;
using PEApps.Model.WorkorderQuery;

namespace ConsoleTests
{
    public class WorkorderBriefBriefMap : ClassMapping<WorkorderBriefBrief>
    {
        public WorkorderBriefBriefMap()
        {
            Schema("MAXIMO");
            Table("WORKORDER");
            Property(x => x.WorkorderNumber, m =>
            {
                m.Access(Accessor.ReadOnly);
                m.Column("WONUM");
            });
        }
    }
}
Putting it Together: Program.cs
namespace ConsoleTests
{
    class Program
    {
        static void Main(string[] args)
        {
            NHibernateProfiler.Initialize();
            try
            {
                var cfg = new Configuration();
                cfg
                    .DataBaseIntegration(db =>
                    {
                        db.ConnectionString = "[Redacted]";
                        db.Dialect<Oracle10gDialect>();
                        db.Driver<OracleManagedDataClientDriver>();
                        db.KeywordsAutoImport = Hbm2DDLKeyWords.AutoQuote;
                        db.BatchSize = 500;
                        db.LogSqlInConsole = true;
                    })
                    .AddAssembly(typeof(WorkorderBriefBriefMap).Assembly)
                    .SessionFactory().GenerateStatistics();
                var factory = cfg.BuildSessionFactory();
                List<WorkorderBriefBrief> query;
                using (var session = factory.OpenSession())
                {
                    Console.WriteLine("session opened");
                    Console.ReadLine();
                    using (var transaction = session.BeginTransaction())
                    {
                        Console.WriteLine("transaction opened");
                        Console.ReadLine();
                        query =
                            (from workorderbriefbrief in session.Query<WorkorderBriefBrief>() select workorderbriefbrief)
                            .ToList();
                        transaction.Commit();
                        Console.WriteLine("Transaction Committed");
                    }
                }
                Console.WriteLine("result length is {0}", query.Count);
                Console.WriteLine("about to write WOs");
                foreach (WorkorderBriefBrief wo in query)
                {
                    Console.WriteLine("{0}", wo.WorkorderNumber);
                }
                Console.WriteLine("DONE!");
                Console.ReadLine();

                // Test a standard connection below
                string constr = "[Redacted]";
                OracleConnection con = new OracleConnection(constr);
                con.Open();
                Console.WriteLine("Connected to Oracle Database {0}, {1}", con.ServerVersion, con.DatabaseName.ToString());
                con.Dispose();
                Console.WriteLine("Press RETURN to exit.");
                Console.ReadLine();
            }
            catch (Exception ex)
            {
                Console.WriteLine("Error : {0}", ex);
                Console.ReadLine();
            }
        }
    }
}
Thanks in advance for any help you can give!
Update
The following code (standard ADO.NET with an OracleDataReader) works fine, returning the 16 work order numbers it should. To me, this points to my use of NHibernate rather than the Oracle managed ODP.NET driver, so I'm hoping it's just something stupid I did above in the mapping or configuration.
// Test a standard connection below
string constr = "[Redacted]";
OracleConnection con = new Oracle.ManagedDataAccess.Client.OracleConnection(constr);
con.Open();
Console.WriteLine("Connected to Oracle Database {0}, {1}", con.ServerVersion, con.DatabaseName);
var cmd = new OracleCommand();
cmd.Connection = con;
cmd.CommandText = "select wonum from maximo.workorder where upper(reportedby) = 'MAXADMIN'";
cmd.CommandType = CommandType.Text;
Oracle.ManagedDataAccess.Client.OracleDataReader reader = cmd.ExecuteReader();
while (reader.Read())
{
    Console.WriteLine(reader.GetString(0));
}
con.Dispose();
When configuring NHibernate, you need to tell it about your mappings.
I found the answer. Thanks to Oskar's initial suggestion, I realized it wasn't just that I hadn't added the assembly; I also needed to create a new mapper.
To do this, I added the following code to the configuration before building my session factory:
var mapper = new ModelMapper();
//define mappingType(s) -- could be an array; in my case it was just 1
var mappingType = typeof (WorkorderBriefBriefMap);
//use AddMappings instead if you're mapping an array
mapper.AddMapping(mappingType);
//add the compiled results of the mapper to the configuration
cfg.AddMapping(mapper.CompileMappingForAllExplicitlyAddedEntities());
var factory = cfg.BuildSessionFactory();

Update object in foreach loop

I am using EF4/LINQ for the first time and have run into an issue. I am looping through the results of a LINQ query using a foreach loop as follows:
private static void OnTimedEvent(object source, ElapsedEventArgs e)
{
    CallOutcomeSubmission los = new CallOutcomeSubmission();
    client = connectToService();
    try
    {
        using (var context = new CallOutcomeContext())
        {
            // List of available actions
            const string ACTION_CALL_ATTEMPT = "Call Attempt";
            DateTime oneDayAgo = DateTime.Now.AddHours(-24);

            var query = from co in context.T_MMCallOutcome
                        join ca in context.T_Call on co.CallID equals ca.CallID
                        join lv in context.T_LeadVendorEmailHeader on co.LeadVendorEmailID equals lv.LeadVendorEmailID
                        where co.EnteredOn > oneDayAgo && co.MMLeadActionID == null
                        select new
                        {
                            co.CallOutcomeID,
                            co.CallID,
                            co.LeadVendorEmailID,
                            MMLeadID = lv.email_text,
                            ca.OutcomeID,
                            lv.FranchiseNumber,
                            co.MMLeadActionID,
                            co.LeadAction
                        };

            // if any results found for query
            if (query.Any())
            {
                foreach (var call in query.ToList())
                {
                    // if the franchise exists
                    if (client.FranchiseExists(int.Parse(call.FranchiseNumber)))
                    {
                        switch (call.OutcomeID)
                        {
                            case 39: // Not Answered
                                call.LeadAction = ACTION_CALL_ATTEMPT;
                                break;
                            case 43: // Remove from Call List
                                break;
                            default: // If the OutcomeID is not identified in the case statement
                                break;
                        } // switch
                    }
                    else
                    {
                        los.eventLog.WriteEntry("CallOutcomeSubmission.OnTimedEvent: No franchise found with franchise ID " + call.FranchiseNumber);
                    }
                    // Save any changes currently on context
                    context.SaveChanges();
                } // foreach
            }
            // if no results found from query write system log stating such
            else
            {
                los.eventLog.WriteEntry("CallOutcomeSubmission.OnTimedEvent: No new entries found");
            }
        } // using
        client.Close();
    }
    catch (System.TimeoutException exception)
    {
        los.eventLog.WriteEntry("CallOutcomeSubmission.OnTimedEvent:" + exception.ToString());
        client.Abort();
    }
    catch (System.ServiceModel.CommunicationException exception)
    {
        los.eventLog.WriteEntry("CallOutcomeSubmission.OnTimedEvent:" + exception.ToString());
        client.Abort();
    }
}
When I try to do the assignment:
call.LeadAction = ACTION_CALL_ATTEMPT;
I get a build error of
Property or indexer 'AnonymousType#2.LeadAction' cannot be assigned to -- it is read only
I can't seem to find anything on this specific error doing a Google search and am not sure what I am doing wrong. Is it because the original query contains a join?
How can I do the assignment of call.LeadAction within the foreach loop?
I would also like to know if there are design issues with the way I have written the query or performed any of the operations, since this is my first foray into EF/LINQ.
You're creating a new anonymous type with the LINQ joins and then trying to set a value on it. What you really want to do is update the call's LeadAction, correct?
How would EF know how to translate your new query back to an entity so it can go back to the database? It would have to jump through a lot of hoops, and it's not capable of that.
What you could do is retrieve the Call from your database and set the LeadAction that way. I'm using Find, assuming that CallID is your PK:
case 39: // Not Answered
    var thisCall = context.T_Call.Find(call.CallID);
    thisCall.LeadAction = ACTION_CALL_ATTEMPT;
    break;
