
Script to build OP Curations and cache the output#1196

Draft
ksen0 wants to merge 2 commits into main from curation-script

Conversation

@ksen0 (Member) commented Feb 24, 2026

Relates to #1187

This is marked as a draft because it is not yet clear whether a different solution is possible. Alternatives that I am investigating with OP in the next week or so:

  1. Use a token (stored as a GH secret) during the build process
  2. Switch to client-side calls (thanks @doradocodes for your input on this)

If we keep the build process, we need a better approach; even with caching, this particular script generates a large number of checked-in JSON files, which is not ideal.

Full disclosure on AI use: Copilot was used for the script, since I was on a time crunch for the fix.

@davepagurek (Collaborator)

Going to take a more thorough look later, but I think checking in the cached content is still generally the way to go, similar to how our other pages on the site that are derived from other data work. A possible future approach to minimize the amount of data, like what we do with reference and contributor docs, would be to have the script transform the data into a more directly usable format instead of storing it verbatim, e.g. just the titles and sketch sizes or something. But storing the full data does give us a bit more flexibility to change what we do with it, so it's also not a bad approach at all.

@aashu2006 (Contributor)

Caching at build time makes sense. Maybe we could reduce repo bloat by transforming the OpenProcessing response and only storing the fields the site actually uses. Also, would a single consolidated op-curations.json file make updates cleaner than multiple generated files?
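As a sketch of that transform idea (the raw field names below are assumptions for illustration, not the actual OpenProcessing response shape):

```typescript
// Hypothetical shape of one curated sketch as returned by the
// OpenProcessing API (field names are assumptions, not the real API).
interface OpSketch {
  visualID: number;
  title: string;
  userID: number;
  fullname: string;
  // ...many more fields the site never reads
}

// The trimmed shape that would actually be checked into the repo.
interface CuratedSketch {
  id: number;
  title: string;
  author: string;
}

// Keep only the fields the site uses, so the cached JSON stays small.
function toCuratedSketch(raw: OpSketch): CuratedSketch {
  return {
    id: raw.visualID,
    title: raw.title,
    author: raw.fullname,
  };
}

// A consolidated file would then just be the mapped array:
const curated = [
  { visualID: 42, title: "Flow Field", userID: 7, fullname: "Ada" },
].map(toCuratedSketch);
```

Writing `curated` out as a single op-curations.json would keep the diff to one file per refresh instead of many.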

@msawired commented Mar 3, 2026

Hello,

Following up on an email convo I had with @ksen0 on this.

A few notes: I will soon enforce bearer tokens on API requests to OP, so I imagine that will create some extra hassle in this repo to manage and pass around those tokens.
I also considered moving the requests on client-side, but this would probably just push the rate limiting issues to individuals and clog your client-side error logs in the future, such as when a school (same IP) is checking out the community section of the site.

Given that, it makes most sense to me to create a specific endpoint for you to pull all the necessary JSON information in a single shot. This could also be cached on my backend to prevent any resource clogging whenever multiple parties are building the website at the same time.

I will check on the code and try to extract the data you need, and I will bring this into a single call, such as:
oppr.org/webserv/p5_website_curation

You can also cache the output of this in case something goes wrong on the OP site during a build, and skip the data refresh on error (5XX/4XX responses).
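A minimal sketch of that fallback behavior, with the fetch and file I/O injected as parameters so the real build script could wire in `fetch` and `fs` (the endpoint, shapes, and helper names here are illustrative assumptions):

```typescript
// Minimal result shape the loader needs from whatever fetch is injected.
type FetchResult = { ok: boolean; status: number; json: () => Promise<unknown> };

// Build-time loader: prefer fresh data from the single-shot OP endpoint;
// on any 4XX/5XX response or network failure, skip the refresh and fall
// back to the previously cached JSON instead of failing the build.
async function loadCurations(
  fetchJson: () => Promise<FetchResult>,
  readCache: () => Promise<unknown>,
  writeCache: (data: unknown) => Promise<void>,
): Promise<unknown> {
  try {
    const res = await fetchJson();
    if (!res.ok) throw new Error(`OP returned ${res.status}`); // 4XX/5XX
    const data = await res.json();
    await writeCache(data); // refresh the checked-in cache on success
    return data;
  } catch {
    return readCache(); // skip the refresh, reuse the last good data
  }
}
```

In the build script, `fetchJson` would call the proposed oppr.org/webserv/p5_website_curation endpoint and `readCache`/`writeCache` would read and write the checked-in JSON file.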
