Public Data Listing button gives internal server error in some cases #51

Closed
kvuppala opened this issue Nov 22, 2013 · 9 comments

Comments

@kvuppala

Logged into the inventory as CKAN admin.
Going to http://inventory.data.gov/organization/aphis-usda-gov and clicking the "Public Data Listing" button generates a 500 internal server error.

@cew821

cew821 commented Nov 22, 2013

I noticed yesterday (or maybe since the Tuesday release) that the creation of the JSON file was taking a lot longer than previously. I'm not sure why, but probably something different is happening behind the scenes.

The request eventually works for me, but I wouldn't be surprised if it's approaching a request timeout limit, and I'm only running it on 15 datasets so far.

@cew821

cew821 commented Nov 22, 2013

I'm trying to think what has changed:

  • new build on Tuesday: was any different logic introduced to the json generator that we should check?
  • many orgs harvested lots of datasets, so the count of datasets in the database went up a lot (though is still small in absolute terms). If there was something slow in the query methods used in the json generator script, those might only show up as items were added to the database. Recommend you check those methods to ensure:
    • All the search criteria used to generate the json are searching columns that have indexes
    • You're eager loading associated records that are also used when running the query... for example if the query is on datasets, you should eager load resources so you only have to hit the database once, not n+1 for each dataset.
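As a rough illustration of the two checks above (indexed search columns and avoiding n+1 queries), here is a minimal sketch using Python's built-in `sqlite3`. The `dataset` and `resource` tables, their columns, and both function names are hypothetical and are not CKAN's actual schema or the json generator's code; the point is only the query shape.

```python
import sqlite3

# Hypothetical, simplified schema; NOT the real CKAN tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dataset (id INTEGER PRIMARY KEY, org TEXT, title TEXT);
    CREATE TABLE resource (id INTEGER PRIMARY KEY, dataset_id INTEGER, url TEXT);
    -- Index the columns used as search criteria.
    CREATE INDEX idx_dataset_org ON dataset (org);
    CREATE INDEX idx_resource_dataset ON resource (dataset_id);
""")
conn.executemany(
    "INSERT INTO dataset VALUES (?, ?, ?)",
    [(1, "aphis", "A"), (2, "aphis", "B")],
)
conn.executemany(
    "INSERT INTO resource VALUES (?, ?, ?)",
    [
        (1, 1, "http://example.org/a.csv"),
        (2, 1, "http://example.org/a.json"),
        (3, 2, "http://example.org/b.csv"),
    ],
)


def resources_n_plus_one(conn, org):
    """One query for the datasets, then one more query per dataset."""
    result = {}
    datasets = conn.execute(
        "SELECT id, title FROM dataset WHERE org = ? ORDER BY id", (org,)
    ).fetchall()
    for dataset_id, title in datasets:
        rows = conn.execute(
            "SELECT url FROM resource WHERE dataset_id = ? ORDER BY id",
            (dataset_id,),
        ).fetchall()
        result[title] = [url for (url,) in rows]
    return result


def resources_single_query(conn, org):
    """Load datasets and their resources together in a single joined query."""
    query = """
        SELECT d.title, r.url
        FROM dataset d
        LEFT JOIN resource r ON r.dataset_id = d.id
        WHERE d.org = ?
        ORDER BY d.id, r.id
    """
    result = {}
    for title, url in conn.execute(query, (org,)):
        urls = result.setdefault(title, [])
        if url is not None:  # datasets without resources still get an entry
            urls.append(url)
    return result
```

Both functions return the same mapping of dataset title to resource URLs; the second hits the database once regardless of how many datasets the organization has.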

@gbinal
Member

gbinal commented Nov 22, 2013

+1

I've noticed the same behavior and the same timing. @dwcaraway: any ideas? @kvuppala, we might need you or yours to find out how we get our hands on the logs for this.

@emakred

emakred commented Nov 25, 2013

+1
Seeing the same behavior on both the EDI and the PDL. It eventually worked, but not on the first few tries across a few different browser sessions.

@dwcaraway

@kvuppala @FuhuXia can I get the server logs when this error happens? If it's a 500, then there will be a stack trace in the logs.

@cew821

cew821 commented Nov 26, 2013

Not sure if you guys are using a similar approach as this project, but you may want to check out this pull request:
HHS/ckanext-datajson#8

@dwcaraway

Reviewing the logs, I'm seeing a recurring apache error log entry:

Nov 25 13:31:52 (omitted) apache_error_log: [Mon Nov 25 13:31:52 2013] [error] [client (omitted)] Error - <type 'exceptions.KeyError'>: 'public_access_level'
Nov 25 13:31:52 (omitted) apache_access_log: (omitted) - (omitted) [25/Nov/2013:13:30:32 -0500] "GET /organization/(omitted)/data.json HTTP/1.1" 500 12613 "i" "i""i""i" 

From reviewing the logs, every exceptions.KeyError occurrence was due to 'public_access_level' and all resulted in an HTTP 500 exception.

From looking at https://github.com/GSA/ckanext-datajson/blob/master/ckanext/datajson/plugin.py it appears that the below code snippet is the suspect:

re.match(r'[Nn]on-public', extras['public_access_level'])

Will fix by changing the extras['public_access_level'] lookup to a get call with a default value of 'Public'.
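A minimal sketch of that change, assuming `extras` is a plain dict of the dataset's extras fields. The helper name `is_non_public` is hypothetical and only wraps the snippet above for illustration; it is not the actual code in plugin.py.

```python
import re


def is_non_public(extras):
    """Return True when the extras mark the dataset as non-public."""
    # dict.get() avoids the KeyError seen in the logs when the field is
    # missing; 'Public' is the default value proposed in this comment.
    access_level = extras.get('public_access_level', 'Public')
    return re.match(r'[Nn]on-public', access_level) is not None
```

With this shape, a dataset that has no 'public_access_level' extra is treated as public instead of raising an exception that surfaces as an HTTP 500.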

dwcaraway pushed a commit to GSA/ckanext-datajson that referenced this issue Dec 2, 2013
…eturning an HTTP 500 error when the datasets does not have a 'public_access_level' extras field. Modified so that if this extra isn't found, we default to public access level.
@dwcaraway

On my local machine, I verified this fix by creating a dataset, then editing the dataset and deleting the 'public_access_level' field, then selecting the public data listing button and verifying that a data.json is rendered. I also tested by creating an organization with no datasets and verifying that both the public data listing and enterprise data inventory buttons render a data.json file.

This fix will deploy to staging and production on Tuesday, 3 December. I will close out the ticket once the fix is verified in staging.

@dwcaraway

Fix is in production and appears to be working.

zr2d2 pushed a commit to HHS/ckanext-datajson that referenced this issue Dec 3, 2014
…eturning an HTTP 500 error when the datasets does not have a 'public_access_level' extras field. Modified so that if this extra isn't found, we default to public access level.