Getting Netty client errors in Storm topology and workers restarting - apache-storm

Version Info:
"org.apache.storm" % "storm-core" % "1.2.1"
"org.apache.storm" % "storm-kafka-client" % "1.2.1"
I have a Storm topology with 3 bolts (A, B, C), where the middle bolt takes around 450 ms mean time and the other two bolts take less than 1 ms.
I am running the topology with the following parallelism hint values on two machines (a minimal wiring sketch follows this list):
A: 4
B: 700
C: 10
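For context, here is a minimal sketch of how a topology with these parallelism hints is typically wired; the bolt class is a no-op placeholder and the broker/topic names are illustrative assumptions, not my actual code:

    import org.apache.storm.Config;
    import org.apache.storm.StormSubmitter;
    import org.apache.storm.kafka.spout.KafkaSpout;
    import org.apache.storm.kafka.spout.KafkaSpoutConfig;
    import org.apache.storm.topology.BasicOutputCollector;
    import org.apache.storm.topology.OutputFieldsDeclarer;
    import org.apache.storm.topology.TopologyBuilder;
    import org.apache.storm.topology.base.BaseBasicBolt;
    import org.apache.storm.tuple.Tuple;

    public class MyTopo {
        // Placeholder bolt standing in for A, B and C; the real bolts do the actual work.
        public static class NoOpBolt extends BaseBasicBolt {
            @Override public void execute(Tuple input, BasicOutputCollector collector) { }
            @Override public void declareOutputFields(OutputFieldsDeclarer declarer) { }
        }

        public static void main(String[] args) throws Exception {
            TopologyBuilder builder = new TopologyBuilder();
            KafkaSpoutConfig<String, String> spoutConf =
                    KafkaSpoutConfig.builder("broker:9092", "events").build(); // placeholder broker/topic
            builder.setSpout("kafka-spout", new KafkaSpout<>(spoutConf), 4);
            builder.setBolt("boltA", new NoOpBolt(), 4).shuffleGrouping("kafka-spout");
            builder.setBolt("boltB", new NoOpBolt(), 700).shuffleGrouping("boltA"); // the slow ~450 ms bolt
            builder.setBolt("boltC", new NoOpBolt(), 10).shuffleGrouping("boltB");

            Config conf = new Config();
            conf.setNumWorkers(2); // one worker per machine in this setup
            StormSubmitter.submitTopology("myTopo", conf, builder.createTopology());
        }
    }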
I am getting the following error a few minutes after the topology starts:
In the worker log:
2018-07-04T20:16:28.835+05:30 Client [ERROR] discarding 7 messages because the Netty client to Netty-Client-/ip:6700 is being closed
In the supervisor logs:
2018-07-04 20:16:29.468 o.a.s.d.s.BasicContainer [INFO] Worker Process 32bc11c0-a1d0-4593-a91a-3ff788ea041a exited with code: 20
2018-07-04 20:16:31.592 o.a.s.d.s.Slot [WARN] SLOT 6700: main process has exited
2018-07-04 20:16:31.592 o.a.s.d.s.Container [INFO] Killing 2825cbe9-aedd-4f10-a796-4f9dc30ae72f:32bc11c0-a1d0-4593-a91a-3ff788ea041a
2018-07-04 20:16:31.600 o.a.s.u.Utils [INFO] Error when trying to kill 7422. Process is probably already dead.
2018-07-04 20:16:32.600 o.a.s.d.s.Slot [INFO] STATE RUNNING msInState: 391195 topo:myTopo-1-1530715184 worker:32bc11c0-a1d0-4593-a91a-3ff788ea041a -> KILL_AND_RELAUNCH msInState: 0 topo:myTopo-1-1530715184 worker:32bc11c0-a1d0-4593-a91a-3ff788ea041a
2018-07-04 20:16:32.600 o.a.s.d.s.Container [INFO] GET worker-user for 32bc11c0-a1d0-4593-a91a-3ff788ea041a
I see similar questions asked here and here. I have a few queries related to this:
Why is this error occurring, and how do I resolve it?
How do I get more debug information from Storm? I have already set conf.setDebug(true) (a config sketch follows these questions).
Are there any limitations/guidelines around how much parallelism is OK for a bolt across n machines?
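For concreteness, a sketch of where such settings live when building the Config; setDebug(true) is what I already have, while the remaining values are illustrative tuning knobs commonly adjusted when workers die under load, not settings confirmed to fix this:

    import org.apache.storm.Config;

    public class TuningSketch {
        static Config baseConfig() {
            Config conf = new Config();
            conf.setDebug(true);            // per-tuple logging (already enabled in my case)
            conf.setNumWorkers(2);
            conf.setMaxSpoutPending(1000);  // cap in-flight tuples so a slow bolt cannot flood the workers
            conf.setMessageTimeoutSecs(60); // must exceed the worst-case end-to-end tuple latency
            conf.put(Config.STORM_LOCAL_HOSTNAME, "worker-host-1"); // placeholder; see Edit 2 below
            return conf;
        }
    }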
Edit:
Logs for strace -fp PID -e trace=read,write,network,signal,ipc are in a gist. A relevant-looking part is from when the above happens, though I see such SIGSEGV entries in many places in the strace output:
[pid 23635] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x7f83af6f1180} ---
[pid 23549] <... read resumed> "PK\3\4\n\0\0\0\10\0\364J\336F\222'\202\312\310\2\0\0\16\5\0\0\36\0\0\0", 30) = 30
[pid 23654] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x7f83af6f1f80} ---
[pid 23549] read(23, "\235TmW\22A\24~\6\224\227u\vE4\255,JR\300WP\322\0245TH\23\313\3j\347"..., 712) = 712
[pid 23654] rt_sigreturn({mask=[QUIT]}) = 140203560738688
[pid 23635] rt_sigreturn({mask=[QUIT]}) = 140203560735104
The strace output of the worker process is here; the relevant-looking logs are:
[pid 24435] recvfrom(291, "HTTP/1.1 200 OK\r\nContent-Type: a"..., 8192, 0, NULL, NULL) = 544
[pid 23473] write(3, "Heap\n garbage-first heap total"..., 347) = 347
[pid 24434] +++ exited with 20 +++
[pid 24405] +++ exited with 20 +++
[pid 24435] +++ exited with 20 +++
[pid 24427] +++ exited with 20 +++
Edit 2:
There is this question as well: Connection refused error in worker logs - apache storm. As per its answer, not setting storm.local.hostname might cause it, but it is already set for me.
There is another bug filed here with a similar Netty error, which is also still unresolved.

Related

Maven Spark Source Code Build Fails in Ubuntu 20.04

I am trying to build the Spark 3.1.1 source code using mvn as below:
./build/mvn -DskipTests clean package
However, the build fails without giving any proper error, as below:
killed 13456 "${MVN_BIN}" -DzincPort=${ZINC_PORT} "$@"
Any help is much appreciated.
My environment is as below:
OS: Ubuntu 20.04
Java : Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=1G; support was removed in 8.0
java version "1.8.0_271"
Java(TM) SE Runtime Environment (build 1.8.0_271-b09)
Java HotSpot(TM) 64-Bit Server VM (build 25.271-b09, mixed mode)
spark Version : Spark 3.1.1
I encountered the same problem. After investigation, I confirmed that the Linux oom-killer had killed the process because of insufficient memory.
Maven Log:
[INFO] --- scala-maven-plugin:4.5.6:compile (scala-compile-first) @ spark-hive-thriftserver_2.12 ---
[INFO] Using incremental compilation using Mixed compile order
[INFO] Compiler bridge file: /home/tianshuang/.sbt/1.0/zinc/org.scala-sbt/org.scala-sbt-compiler-bridge_2.12-1.5.8-bin_2.12.15__52.0-1.5.8_20211211T222914.jar
[INFO] compiler plugin: BasicArtifact(com.github.ghik,silencer-plugin_2.12.15,1.7.6,null)
[INFO] compiling 27 Scala sources and 86 Java sources to /home/tianshuang/IdeaProjects/latest/spark/sql/hive-thriftserver/target/scala-2.12/classes ...
./build/mvn: line 185: 8772 Killed "${MVN_BIN}" "$@"
Cmd:
dmesg -T | grep -i kill
Output:
[Wed Apr 13 22:07:44 2022] fsnotifier invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
[Wed Apr 13 22:07:44 2022] oom_kill_process.cold+0xb/0x10
[Wed Apr 13 22:07:44 2022] [ 1862] 1000 1862 114275 95 110592 117 0 gsd-rfkill
[Wed Apr 13 22:07:44 2022] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-1000.slice/user@1000.service,task=java,pid=8772,uid=1000
[Wed Apr 13 22:07:44 2022] Out of memory: Killed process 8772 (java) total-vm:15551840kB, anon-rss:4862120kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:10256kB oom_score_adj:0

Docker installation of OpenProject: Phusion Passenger fails to start after installation

I am trying to install OpenProject using Docker on CentOS 7.6, but Phusion Passenger fails to start after installation. The error suggests it failed to parse a response:
The preloader process sent an unparseable response:. I don't know how to fix this issue.
stdout:
-----> Database setup finished.
On first installation, the default admin credentials are login: admin, password: admin
-----> Launching supervisord...
2019-05-08 08:14:46,313 CRIT Supervisor running as root (no user in config file)
2019-05-08 08:14:46,318 INFO supervisord started with pid 1
2019-05-08 08:14:47,321 INFO spawned: 'postgres' with pid 155
2019-05-08 08:14:47,325 INFO spawned: 'apache2' with pid 156
2019-05-08 08:14:47,328 INFO spawned: 'web' with pid 157
2019-05-08 08:14:47,331 INFO spawned: 'worker' with pid 158
2019-05-08 08:14:47,351 INFO spawned: 'postfix' with pid 159
2019-05-08 08:14:47,360 INFO spawned: 'memcached' with pid 160
2019-05-08 08:14:47.634 UTC [172] LOG: database system was shut down at 2019-05-08 08:14:44 UTC
2019-05-08 08:14:47,634 INFO success: postfix entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2019-05-08 08:14:47.649 UTC [172] LOG: MultiXact member wraparound protections are now enabled
2019-05-08 08:14:47.653 UTC [155] LOG: database system is ready to accept connections
2019-05-08 08:14:47.663 UTC [177] LOG: autovacuum launcher started
2019-05-08 08:14:48,670 INFO success: postgres entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2019-05-08 08:14:48,670 INFO success: apache2 entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2019-05-08 08:14:48,670 INFO success: web entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2019-05-08 08:14:48,670 INFO success: worker entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2019-05-08 08:14:48,670 INFO success: memcached entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 172.17.0.2. Set the 'ServerName' directive globally to suppress this message
2019-05-08 08:14:50,198 INFO exited: postfix (exit status 0; expected)
--> Downloading a Phusion Passenger agent binary for your platform
--> Installing Nginx 1.15.8 engine
--------------------------
[passenger_native_support.so] trying to compile for the current user (app) and Ruby interpreter...
(set PASSENGER_COMPILE_NATIVE_SUPPORT_BINARY=0 to disable)
Compilation successful. The logs are here:
/tmp/passenger_native_support-15tsfhk.log
[passenger_native_support.so] successfully loaded.
=============== Phusion Passenger Standalone web server started ===============
PID file: /app/tmp/pids/passenger.8080.pid
Log file: /app/log/passenger.8080.log
Environment: production
Accessible via: http://0.0.0.0:8080/
You can stop Phusion Passenger Standalone by pressing Ctrl-C.
Problems? Check https://www.phusionpassenger.com/library/admin/standalone/troubleshooting/
===============================================================================
[ N 2019-05-08 08:15:01.7338 404/Tb age/Cor/SecurityUpdateChecker.h:519 ]: Security update check: no update found (next check in 24 hours)
Forcefully loading the application. Use :environment to avoid eager loading.
[auth_saml] Missing settings from '/app/config/plugins/auth_saml/settings.yml', skipping omniauth registration.
hook registered
App 439 output: [auth_saml] Missing settings from '/app/config/plugins/auth_saml/settings.yml', skipping omniauth registration.
App 439 output: hook registered
Creating scope :order_by_name. Overwriting existing method Sprint.order_by_name.
App 439 output: Creating scope :order_by_name. Overwriting existing method Sprint.order_by_name.
[Worker(host:d0b3748f627a pid:158)] Starting job worker
2019-05-08T08:15:45+0000: [Worker(host:d0b3748f627a pid:158)] Starting job worker
App 439 output: /app/vendor/bundle/ruby/2.6.0/gems/passenger-6.0.1/src/ruby_supportlib/phusion_passenger/preloader_shared_helpers.rb:108:in `fork': Cannot allocate memory - fork(2) (Errno::ENOMEM)
App 439 output: from /app/vendor/bundle/ruby/2.6.0/gems/passenger-6.0.1/src/ruby_supportlib/phusion_passenger/preloader_shared_helpers.rb:108:in `handle_spawn_command'
App 439 output: from /app/vendor/bundle/ruby/2.6.0/gems/passenger-6.0.1/src/ruby_supportlib/phusion_passenger/preloader_shared_helpers.rb:78:in `accept_and_process_next_client'
App 439 output: from /app/vendor/bundle/ruby/2.6.0/gems/passenger-6.0.1/src/ruby_supportlib/phusion_passenger/preloader_shared_helpers.rb:167:in `run_main_loop'
App 439 output: from /app/vendor/bundle/ruby/2.6.0/gems/passenger-6.0.1/src/helper-scripts/rack-preloader.rb:207:in `<module:App>'
App 439 output: from /app/vendor/bundle/ruby/2.6.0/gems/passenger-6.0.1/src/helper-scripts/rack-preloader.rb:30:in `<module:PhusionPassenger>'
App 439 output: from /app/vendor/bundle/ruby/2.6.0/gems/passenger-6.0.1/src/helper-scripts/rack-preloader.rb:29:in `<main>'
[ E 2019-05-08 08:15:46.6971 404/Tc age/Cor/App/Implementation.cpp:221 ]: Could not spawn process for application /app: The preloader process sent an unparseable response:
Error ID: d7825364
Error details saved to: /tmp/passenger-error-wjSTKF.html
[ E 2019-05-08 08:15:46.7028 404/T8 age/Cor/Con/CheckoutSession.cpp:276 ]: [Client 1-1] Cannot checkout session because a spawning error occurred. The identifier of the error is d7825364. Please see earlier logs for details about the error.
[ W 2019-05-08 08:34:24.7967 404/Tk age/Cor/Spa/SmartSpawner.h:572 ]: An error occurred while spawning an application process: Cannot connect to Unix socket '/tmp/passenger.PKROzbY/apps.s/preloader.hyl9g8': No such file or directory (errno=2)
[ W 2019-05-08 08:34:24.7968 404/Tk age/Cor/Spa/SmartSpawner.h:574 ]: The application preloader seems to have crashed, restarting it and trying again...
App 543 output: [auth_saml] Missing settings from '/app/config/plugins/auth_saml/settings.yml', skipping omniauth registration.
App 543 output: hook registered
App 543 output: Creating scope :order_by_name. Overwriting existing method Sprint.order_by_name.
App 543 output: /app/vendor/bundle/ruby/2.6.0/gems/passenger-6.0.1/src/ruby_supportlib/phusion_passenger/preloader_shared_helpers.rb:108:in `fork': Cannot allocate memory - fork(2) (Errno::ENOMEM)
App 543 output: from /app/vendor/bundle/ruby/2.6.0/gems/passenger-6.0.1/src/ruby_supportlib/phusion_passenger/preloader_shared_helpers.rb:108:in `handle_spawn_command'
App 543 output: from /app/vendor/bundle/ruby/2.6.0/gems/passenger-6.0.1/src/ruby_supportlib/phusion_passenger/preloader_shared_helpers.rb:78:in `accept_and_process_next_client'
App 543 output: from /app/vendor/bundle/ruby/2.6.0/gems/passenger-6.0.1/src/ruby_supportlib/phusion_passenger/preloader_shared_helpers.rb:167:in `run_main_loop'
App 543 output: from /app/vendor/bundle/ruby/2.6.0/gems/passenger-6.0.1/src/helper-scripts/rack-preloader.rb:207:in `<module:App>'
App 543 output: from /app/vendor/bundle/ruby/2.6.0/gems/passenger-6.0.1/src/helper-scripts/rack-preloader.rb:30:in `<module:PhusionPassenger>'
App 543 output: from /app/vendor/bundle/ruby/2.6.0/gems/passenger-6.0.1/src/helper-scripts/rack-preloader.rb:29:in `<main>'
[ E 2019-05-08 08:34:52.2521 404/Tk age/Cor/App/Implementation.cpp:221 ]: Could not spawn process for application /app: The preloader process sent an unparseable response:
Error ID: c2ce0823
Error details saved to: /tmp/passenger-error-bpsfAC.html
[ E 2019-05-08 08:34:52.2570 404/T8 age/Cor/Con/CheckoutSession.cpp:276 ]: [Client 1-2] Cannot checkout session because a spawning error occurred. The identifier of the error is c2ce0823. Please see earlier logs for details about the error.
Thanks.
The important line in the log is this one:
App 439 output: /app/vendor/bundle/ruby/2.6.0/gems/passenger-6.0.1/src/ruby_supportlib/phusion_passenger/preloader_shared_helpers.rb:108:in `fork': Cannot allocate memory - fork(2) (Errno::ENOMEM)
This means your container is unable to allocate the necessary memory. It could be that your system is in an OOM state and things are being killed, or that some other restriction on the daemon prevents it from allocating additional memory.
For reference:
https://success.docker.com/article/docker-daemon-error-cannot-allocate-memory

Hadoop container failed even though 100 percent completed

I have set up a small cluster with Hadoop 2.7, HBase 0.98, and Nutch 2.3.1. I have written a custom job that first combines docs of the same domain; after that, each URL of the domain is obtained from a cache (i.e., a list), the corresponding key is used to fetch the object via datastore.get(url_key), and then, after updating its score, it is written via context.write.
The job should complete after all docs are processed, but what I have observed is that each attempt fails due to a timeout even though its progress shows 100 percent complete. Here is the log:
attempt_1549963404554_0110_r_000001_1 100.00 FAILED reduce > reduce node2:8042 logs Thu Feb 21 20:50:43 +0500 2019 Fri Feb 22 02:11:44 +0500 2019 5hrs, 21mins, 0sec AttemptID:attempt_1549963404554_0110_r_000001_1 Timed out after 1800 secs Container killed by the ApplicationMaster. Container killed on request. Exit code is 143 Container exited with a non-zero exit code 143
attempt_1549963404554_0110_r_000001_3 100.00 FAILED reduce > reduce node1:8042 logs Fri Feb 22 04:39:08 +0500 2019 Fri Feb 22 07:25:44 +0500 2019 2hrs, 46mins, 35sec AttemptID:attempt_1549963404554_0110_r_000001_3 Timed out after 1800 secs Container killed by the ApplicationMaster. Container killed on request. Exit code is 143 Container exited with a non-zero exit code 143
attempt_1549963404554_0110_r_000002_0 100.00 FAILED reduce > reduce node3:8042 logs Thu Feb 21 12:38:45 +0500 2019 Thu Feb 21 22:50:13 +0500 2019 10hrs, 11mins, 28sec AttemptID:attempt_1549963404554_0110_r_000002_0 Timed out after 1800 secs Container killed by the ApplicationMaster. Container killed on request. Exit code is 143 Container exited with a non-zero exit code 143
Why is this so? When an attempt is 100.00 percent complete, it should be marked as successful. Unfortunately, there is no error information other than the timeout in my case. How do I debug this problem?
My reducer is posted in another question: Apache Nutch 2.3.1 map-reduce timeout occurred while updating the score
I have observed that, in the 3 logs mentioned, the time required for execution varies widely. Please take a closer look at the job you are executing.
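One assumption worth checking, since "Timed out after 1800 secs" means the attempt made no progress report within mapreduce.task.timeout: an attempt can show 100 percent (all input consumed) yet still be killed if a long tail of work, such as many datastore.get calls, runs without status updates. A minimal sketch of reporting liveness from inside the reduce loop; the key/value types here are placeholders for the actual Gora types:

    import java.io.IOException;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    public class DomainScoreReducer extends Reducer<Text, Text, Text, Text> {
        @Override
        protected void reduce(Text domain, Iterable<Text> urls, Context context)
                throws IOException, InterruptedException {
            long processed = 0;
            for (Text url : urls) {
                // ... fetch via datastore.get(url_key) and update the score ...
                context.write(domain, url);
                if (++processed % 100 == 0) {
                    context.progress();  // tells the ApplicationMaster the attempt is alive
                    context.setStatus("processed " + processed + " urls for " + domain);
                }
            }
        }
    }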

macOS Server 5.3 Calendar pg_ctl not starting

After updating macOS Server to 5.3 (running on macOS 10.12.4) my Calendar & Contacts have stopped syncing.
It seems that it's having trouble starting Postgres for cluster /Library/Server/Calendar and Contacts/Data/Database.xpg/cluster.pg and possibly trouble with the agent too.
The GUI seems to think that the Calendar & Contacts services have started and are available, but when I run $ sudo serveradmin fullstatus calendar from the command line I get:
calendar:setStateVersion = 1
calendar:readWriteSettingsVersion = 1
calendar:state = "STARTING"
calendar:contactsState = "STARTING"
calendar:calendarState = "STARTING"
System log is being spammed with:
Apr 22 11:58:42 com.apple.xpc.launchd[1] (org.calendarserver.agent[44649]): Service exited with abnormal code: 1
Apr 22 11:58:42 com.apple.xpc.launchd[1] (org.calendarserver.agent): Service only ran for 0 seconds. Pushing respawn out by 10 seconds.
Apr 22 11:58:52 com.apple.xpc.launchd[1] (org.calendarserver.agent[44659]): Service exited with abnormal code: 1
Apr 22 11:58:52 com.apple.xpc.launchd[1] (org.calendarserver.agent): Service only ran for 0 seconds. Pushing respawn out by 10 seconds.
Apr 22 11:59:02 com.apple.xpc.launchd[1] (org.calendarserver.agent[44668]): Service exited with abnormal code: 1
Apr 22 11:59:02 com.apple.xpc.launchd[1] (org.calendarserver.agent): Service only ran for 0 seconds. Pushing respawn out by 10 seconds.
Apr 22 11:59:07 com.apple.xpc.launchd[1] (org.calendarserver.calendarserver[44676]): Service exited with abnormal code: 1
Apr 22 11:59:07 com.apple.xpc.launchd[1] (org.calendarserver.calendarserver): Service only ran for 0 seconds. Pushing respawn out by 60 seconds.
Here's the output of $ sudo /Applications/Server.app/Contents/ServerRoot/usr/sbin/calendarserver_diagnose. Any ideas?
OS Build: 16E195
Server Build: 16S4123
/Library/Server/Preferences/Calendar.plist exists and can be parsed
Prefs plist says ServerRoot directory is: /Library/Server/Calendar and Contacts
ServerRoot volume ok
/Library/Server/Calendar and Contacts/Config/caldavd-system.plist exists and can be parsed
/Library/Server/Calendar and Contacts/Config/caldavd-user.plist does not exist
Configuration:
Calendar and Contacts service processes:
USER PID %CPU %MEM RSS ELAPSED STARTED COMMAND
root 42554 0.0 0.1 11072 07:49 Sat 22 Apr 11:32:16 2017 servermgr_calendar
Serverd status:
org.calendarserver.agent is enabled
org.calendarserver.calendarserver is enabled
org.calendarserver.relocate is enabled
Disk space on boot volume:
Filesystem Size Used Avail Capacity iused ifree %iused Mounted on
/dev/disk1 999G 777G 222G 78% 8520180 4286447099 0% /
Disk space on service data volume:
Filesystem Size Used Avail Capacity iused ifree %iused Mounted on
/dev/disk1 999G 777G 222G 78% 8520180 4286447099 0% /
Disk space used by Calendar and Contacts service:
20K /Library/Server/Calendar and Contacts/Config
1014M /Library/Server/Calendar and Contacts/Data
200M /Library/Server/Calendar and Contacts/Logs
Postgres status for cluster /Library/Server/Calendar and Contacts/Data/Database.xpg/cluster.pg:
pg_ctl: no server running
Agent:
Attempting to send a request to the agent...
Can't connect to agent: timed out
Server connection:
Traceback (most recent call last):
File "/Applications/Server.app/Contents/ServerRoot/usr/sbin/calendarserver_diagnose", line 14, in <module>
load_entry_point('CalendarServer==9.1a1.dev0+56b4197875debefef19d9c19840f903a8e480c88.head', 'console_scripts', 'calendarserver_diagnose')()
File "/Applications/Server.app/Contents/ServerRoot/Library/CalendarServer/lib/python2.7/site-packages/calendarserver/tools/diagnose.py", line 145, in main
connectToCaldavd(keys)
File "/Applications/Server.app/Contents/ServerRoot/Library/CalendarServer/lib/python2.7/site-packages/calendarserver/tools/diagnose.py", line 584, in connectToCaldavd
url = "https://{host}/principals/".format(host=keys["ServerHostName"])
KeyError: 'ServerHostName'

CouchDB 1.6.1 on Windows 08 r2 - os_process_error exit status 1

I have an installation of CouchDB which has been working well for a few weeks now. Today it started to throw an os_process_error exit status 1 when attempting to look at any view. The documents in the DB are very small and the views are quite simple. The total DB size is 20 MB and the largest document is 2 MB; I have noticed that the ERL process pegs the CPU at 99%.
I've looked at:
CouchDB delay building index (CouchDB 1.5.0 on Windows Server 2008 R2)
Specific couchdb views suddenly start timing out
I've increased my timeout to 50000 seconds, then lowered it to 500 to see if I could find the document that was killing everything, but nothing shows up. Stale views still work as well.
Below is the debug error:
[Mon, 10 Nov 2014 19:22:19 GMT] [debug] [<0.118.0>] Successful cookie auth as: "sking"
[Mon, 10 Nov 2014 19:22:19 GMT] [info] [<0.118.0>] 192.168.247.158 - - GET /_config/native_query_servers/ 200
[Mon, 10 Nov 2014 19:22:19 GMT] [error] [<0.231.0>] OS Process Error <0.233.0> :: {os_process_error,
{exit_status,1}}
[Mon, 10 Nov 2014 19:22:19 GMT] [error] [emulator] Error in process <0.231.0> with exit value: {{nocatch,{os_process_error,{exit_status,1}}},[{couch_os_process,prompt,2,[{file,"c:/cygwin/relax/APACHE~2.1/src/couchdb/couch_os_process.erl"},{line,57}]},{couch_query_servers,map_doc_raw,2,[{file,"c:/cygwin/relax...
[Mon, 10 Nov 2014 19:22:19 GMT] [debug] [<0.117.0>] Minor error in HTTP request: {os_process_error,
{exit_status,1}}
[Mon, 10 Nov 2014 19:22:19 GMT] [debug] [<0.117.0>] Stacktrace: [{couch_mrview_util,get_view,4,
[{file,
"c:/cygwin/relax/APACHE~2.1/src/COUCH_~3/src/couch_mrview_util.erl"},
{line,49}]},
{couch_mrview,query_view,6,
[{file,
"c:/cygwin/relax/APACHE~2.1/src/COUCH_~3/src/couch_mrview.erl"},
{line,75}]},
{couch_httpd,etag_maybe,2,
[{file,
"c:/cygwin/relax/APACHE~2.1/src/couchdb/couch_httpd.erl"},
{line,610}]},
{couch_mrview_http,design_doc_view,5,
[{file,
"c:/cygwin/relax/APACHE~2.1/src/COUCH_~3/src/couch_mrview_http.erl"},
{line,188}]},
{couch_httpd_db,do_db_req,2,
[{file,
"c:/cygwin/relax/APACHE~2.1/src/couchdb/couch_httpd_db.erl"},
{line,234}]},
{couch_httpd,handle_request_int,5,
[{file,
"c:/cygwin/relax/APACHE~2.1/src/couchdb/couch_httpd.erl"},
{line,318}]},
{mochiweb_http,headers,5,
[{file,
"c:/cygwin/relax/APACHE~2.1/src/mochiweb/mochiweb_http.erl"},
{line,94}]},
{proc_lib,init_p_do_apply,3,
[{file,"proc_lib.erl"},{line,239}]}]
[Mon, 10 Nov 2014 19:22:19 GMT] [info] [<0.117.0>] 192.168.247.158 - - GET /tcs/_design/company/_view/Company_Id?limit=101 500
[Mon, 10 Nov 2014 19:22:19 GMT] [error] [<0.117.0>] httpd 500 error response:
{"error":"os_process_error","reason":"{exit_status,1}"}
I figured this out, but I am not sure why it happened. There was a huge 57 MB document which had been uploaded to the DB, but it was not visible in Futon or anywhere else.
I only found it after digging further into the debug log. I could not access the document via a curl GET. I ended up having to use curl -X DELETE with the specific revision, then a purge, to get rid of the document. As soon as the document was deleted, everything worked as expected.
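For anyone hitting the same thing, here is a sketch of that delete-then-purge sequence against CouchDB's HTTP API. The host, database, document ID, and revision are placeholders; note that the revision list passed to _purge must match what the server actually holds, which after a DELETE may include the tombstone revision the DELETE returns:

    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class PurgeHugeDoc {
        public static void main(String[] args) throws Exception {
            String base = "http://localhost:5984/tcs"; // placeholder host and database
            String docId = "huge-doc-id";              // placeholder document id
            String rev = "1-abc123";                   // placeholder revision from the debug log

            // Step 1: delete the document at its current revision.
            HttpURLConnection del = (HttpURLConnection)
                    new URL(base + "/" + docId + "?rev=" + rev).openConnection();
            del.setRequestMethod("DELETE");
            System.out.println("DELETE -> " + del.getResponseCode());

            // Step 2: purge the revision so the view indexer never touches it again.
            HttpURLConnection purge = (HttpURLConnection)
                    new URL(base + "/_purge").openConnection();
            purge.setRequestMethod("POST");
            purge.setRequestProperty("Content-Type", "application/json");
            purge.setDoOutput(true);
            String body = "{\"" + docId + "\": [\"" + rev + "\"]}";
            try (OutputStream os = purge.getOutputStream()) {
                os.write(body.getBytes("UTF-8"));
            }
            System.out.println("PURGE -> " + purge.getResponseCode());
        }
    }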
