Historics query stuck at 71%


#1

I started a Historics query at 10:45 AM.
6 hours have passed.
For the last 2 hours, historics/get has shown the same progress of 71%:

{
  "id": "60860047fcc47463c5c7",
  "definition_id": "9b5dd445ab0b6cbeda831a41ba1f1534",
  "name": "tweet_Ed_Stenson",
  "start": 1365879600,
  "end": 1366398000,
  "created_at": 1366735065,
  "status": "running",
  "progress": 71,
  "sources": [
    "twitter"
  ],
  "sample": 100,
  "volume_info": {
    "twitter": 2679642635
  },
  "chunks": [
    {
      "status": "succeeded",
      "progress": 100,
      "start_time": 1365879600,
      "end_time": 1365897600
    },
    {
      "status": "init",
      "progress": 0,
      "start_time": 1365897600,
      "end_time": 1365984000
    },
    {
      "status": "init",
      "progress": 0,
      "start_time": 1365984000,
      "end_time": 1366070400
    },
    {
      "status": "succeeded",
      "progress": 100,
      "start_time": 1366070400,
      "end_time": 1366156800
    },
    {
      "status": "succeeded",
      "progress": 100,
      "start_time": 1366156800,
      "end_time": 1366243200
    },
    {
      "status": "succeeded",
      "progress": 100,
      "start_time": 1366243200,
      "end_time": 1366329600
    },
    {
      "status": "succeeded",
      "progress": 100,
      "start_time": 1366329600,
      "end_time": 1366398000
    }
  ]
}

I am not able to find the estimated completion time (mentioned in the documentation).
How do I find out what is going on?


#2

This job was not stuck. Take a look at the 'status' field: it was still running at the time you made this API call. Historics jobs are broken down into day-long chunks. In this case, your job was broken down into seven chunks, each of which is processed independently in a queue. You will often have to wait several hours for one of your chunks to be processed, as the length of the job queue depends entirely on how many Historics jobs other users have submitted.

The estimated completion time is recalculated each time we finish processing one of your chunks, and simply indicates when this Historics job is expected to be complete. This estimate will change as other users submit new jobs to be processed.
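
To see at a glance which chunks are still waiting in the queue, the historics/get response can be summarized per chunk. The following is a minimal sketch in Python that works on a trimmed copy of the response from post #1; the helper name summarize_chunks is hypothetical and not part of any DataSift client library. It also averages the per-chunk progress. That average happens to match the reported 71, but the thread does not confirm how the overall progress is actually computed, so treat that part as an assumption.

```python
from collections import Counter


def summarize_chunks(job):
    """Return (status counts, mean chunk progress, unfinished chunks).

    Hypothetical helper: it only reads the 'chunks' list of a
    historics/get response that has already been parsed into a dict.
    """
    chunks = job.get("chunks", [])
    counts = Counter(chunk["status"] for chunk in chunks)
    # Assumption: a plain average of chunk progress is a fair proxy
    # for the job-level 'progress' value.
    mean_progress = (
        sum(chunk["progress"] for chunk in chunks) / len(chunks) if chunks else 0.0
    )
    pending = [chunk for chunk in chunks if chunk["status"] != "succeeded"]
    return counts, mean_progress, pending


# Trimmed copy of the response from post #1 (only the fields used here).
job = {
    "status": "running",
    "progress": 71,
    "chunks": [
        {"status": "succeeded", "progress": 100, "start_time": 1365879600, "end_time": 1365897600},
        {"status": "init", "progress": 0, "start_time": 1365897600, "end_time": 1365984000},
        {"status": "init", "progress": 0, "start_time": 1365984000, "end_time": 1366070400},
        {"status": "succeeded", "progress": 100, "start_time": 1366070400, "end_time": 1366156800},
        {"status": "succeeded", "progress": 100, "start_time": 1366156800, "end_time": 1366243200},
        {"status": "succeeded", "progress": 100, "start_time": 1366243200, "end_time": 1366329600},
        {"status": "succeeded", "progress": 100, "start_time": 1366329600, "end_time": 1366398000},
    ],
}

counts, mean_progress, pending = summarize_chunks(job)
print(dict(counts))        # how many chunks are in each status
print(int(mean_progress))  # truncated average of chunk progress
print(len(pending))        # chunks that have not succeeded yet
```

Here five of the seven chunks have succeeded and two are still in 'init', so a job whose overall status is 'running' with some chunks at 0% is waiting in the queue rather than stuck.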


#3

Jason, thank you for your reply.
Yes, the job was not stuck; I got the completion notification at 19:37.
I thought there was a bug because 5 chunks had completed while the other 2 remained at 0% progress for about 4 hours.
I still have the same question: where should I look for the estimated completion time? Which attribute in the historics/get output gives the estimated completion time?


#4

Take a look at the /historics/get documentation - you can call /historics/get with the 'with_estimate' parameter, and it will return an estimated_completion field for each chunk and for the whole Historics job.

You can also see the estimated completion time when viewing your Tasks list on the DataSift site.

Example output:

 

{
  "estimated_completion": 1366915953,
  "chunks": [
    {
      "estimated_completion": 1366819099,
      "end_time": 1364083200,
      "start_time": 1363996800,
      "progress": 100,
      "status": "succeeded"
    },
    {
      "estimated_completion": 1366915953,
      "end_time": 1366671600,
      "start_time": 1366588800,
      "progress": 0,
      "status": "init"
    }
  ],
  "sample": 100,
  "sources": [
    "twitter"
  ],
  "id": "75f3a4d73577e3ddde4a",
  "definition_id": "0a1dec7dd367236375b679cb2c099162",
  "name": "Historics Sample",
  "start": 1363996800,
  "end": 1366671600,
  "created_at": 1366806658,
  "status": "running",
  "progress": 50
}
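
The estimated_completion values are easier to read once converted to dates. Below is a minimal sketch in Python, assuming the values are Unix timestamps in seconds and that displaying them in UTC is acceptable; the helper names to_utc and describe_estimates are hypothetical, and the input is a trimmed copy of the example output above.

```python
from datetime import datetime, timezone


def to_utc(timestamp):
    """Format a Unix timestamp (assumed to be in seconds) as a UTC string."""
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


def describe_estimates(job):
    """Build one line for the whole job, then one line per chunk.

    Hypothetical helper: expects a historics/get response requested
    with 'with_estimate' and already parsed into a dict.
    """
    lines = [
        f"job {job['id']} ({job['status']}, {job['progress']}%): "
        f"estimated completion {to_utc(job['estimated_completion'])}"
    ]
    for index, chunk in enumerate(job["chunks"], start=1):
        lines.append(
            f"  chunk {index} ({chunk['status']}, {chunk['progress']}%): "
            f"estimated completion {to_utc(chunk['estimated_completion'])}"
        )
    return lines


# Trimmed copy of the example output above (only the fields used here).
job = {
    "id": "75f3a4d73577e3ddde4a",
    "status": "running",
    "progress": 50,
    "estimated_completion": 1366915953,
    "chunks": [
        {"estimated_completion": 1366819099, "progress": 100, "status": "succeeded"},
        {"estimated_completion": 1366915953, "progress": 0, "status": "init"},
    ],
}

for line in describe_estimates(job):
    print(line)
```

In this example the job-level estimate equals the estimate of the chunk that is still in 'init', which fits the explanation in post #2 that the estimate is updated as chunks finish.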