[source-shopify] Product_images incremental not working right ? #45674

Open
1 task
SebastienCY opened this issue Sep 19, 2024 · 0 comments

Connector Name

source-shopify

Connector Version

2.5.2

What step the error happened?

During the sync

Relevant information

Hello

We noticed an issue with the product_images stream of the Shopify connector.

We have a product-dedicated connection with the products, product_variants, metafield_products and product_images streams selected, all four in "incremental | append" mode to an S3 destination. Most of the time, successive syncs end up syncing the same number of records, which is not expected for an incremental sync.

After a quick look, we observed:

  • The product_images query is built on top of the products query (code here)
  • But according to the sync logs, the period requested for product_images starts earlier than the one requested for products (same for the metafield_products stream, by the way):
Stream: `products` requesting BULK Job           for period: 2024-09-17T02:22:56+00:00 -- 2024-09-17T06:01:36.549059+00:00.
Stream: `product_images` requesting BULK Job     for period: 2024-09-16T15:43:17+00:00 -- 2024-09-17T06:01:38.717901+00:00.
Stream: `metafield_products` requesting BULK Job for period: 2024-09-16T17:23:10+00:00 -- 2024-09-17T06:02:17.968004+00:00.
Stream: `product_variants` requesting BULK Job   for period: 2024-09-16T16:32:47+00:00 -- 2024-09-17T06:02:19.807681+00:00.
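
To quantify the gap, here is a minimal Python sketch that parses the four log lines above and reports how far each stream's period start lags behind the one of `products`. The regular expression and the helper names (`parse_period_starts`, `lag_behind`) are illustrative assumptions written for this log format only; they are not part of the connector.

```python
import re
from datetime import datetime

# Log lines copied from the sync logs above.
LOG_LINES = """\
Stream: `products` requesting BULK Job           for period: 2024-09-17T02:22:56+00:00 -- 2024-09-17T06:01:36.549059+00:00.
Stream: `product_images` requesting BULK Job     for period: 2024-09-16T15:43:17+00:00 -- 2024-09-17T06:01:38.717901+00:00.
Stream: `metafield_products` requesting BULK Job for period: 2024-09-16T17:23:10+00:00 -- 2024-09-17T06:02:17.968004+00:00.
Stream: `product_variants` requesting BULK Job   for period: 2024-09-16T16:32:47+00:00 -- 2024-09-17T06:02:19.807681+00:00.
""".splitlines()

# Hypothetical pattern matching only the log format shown above.
PATTERN = re.compile(
    r"Stream: `(?P<stream>\w+)` requesting BULK Job\s+for period: (?P<start>\S+) -- "
)


def parse_period_starts(lines):
    """Map each stream name to the start of its requested period."""
    starts = {}
    for line in lines:
        match = PATTERN.search(line)
        if match:
            starts[match["stream"]] = datetime.fromisoformat(match["start"])
    return starts


def lag_behind(starts, reference="products"):
    """Return how far each stream's period start is behind the reference stream."""
    ref = starts[reference]
    return {name: ref - start for name, start in starts.items() if name != reference}


if __name__ == "__main__":
    for name, lag in lag_behind(parse_period_starts(LOG_LINES)).items():
        print(f"{name}: starts {lag} before products")
```

On these logs, the product_images period starts 10 h 39 min 39 s before the products one.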

So what most probably happens is:

  • The product images are not updated for a few hours or days, so the product_images cursor stays behind (this is very likely, because product images are updated far less often than products)
  • Meanwhile, the products are updated every few minutes or hours (e.g. an inventory_quantity update when an item is bought, or any other event that updates the product)
  • The product_images query then starts from the last synchronized product_image.updated_at, but because it is applied through the products GraphQL endpoint, it returns the product images of all products updated since the last product image update (which is hours or days behind)
  • This ends up exporting the same product images on every sync run, until a product image is updated and the stream cursor moves forward
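
The hypothesis above can be reproduced with a toy model. This is only a sketch under assumptions, not the connector's code: `Product`, `Image`, `sync_product_images` and `run_syncs` are hypothetical names, and timestamps are plain integers. The filter is applied on the parent product's `updated_at`, while the cursor advances on the image's `updated_at`.

```python
from dataclasses import dataclass, field


@dataclass
class Image:
    id: int
    updated_at: int  # integer timestamp, for illustration only


@dataclass
class Product:
    id: int
    updated_at: int
    images: list = field(default_factory=list)


def sync_product_images(products, cursor):
    """Assumed behavior: filter on the parent's updated_at,
    but advance the cursor on the child's updated_at."""
    emitted = [img for p in products if p.updated_at >= cursor for img in p.images]
    new_cursor = max([cursor] + [img.updated_at for img in emitted])
    return emitted, new_cursor


def run_syncs(n_syncs=3):
    """Run several syncs while only the product (not its image) keeps changing."""
    image = Image(id=1, updated_at=10)
    product = Product(id=1, updated_at=20, images=[image])
    cursor = 0
    counts = []
    for _ in range(n_syncs):
        emitted, cursor = sync_product_images([product], cursor)
        counts.append(len(emitted))
        product.updated_at += 10  # e.g. an inventory change touches the product only
    return counts, cursor


if __name__ == "__main__":
    print(run_syncs(3))  # ([1, 1, 1], 10)
```

In this model the same image is emitted on every run and the cursor stays at 10, which matches the constant record count we see between syncs.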

What do you think?
Many thanks in advance

Relevant log output

No response

Contribute

  • Yes, I want to contribute