@JJetmar
Created June 15, 2023 09:10
Aborting an Apify Actor
import { Actor } from 'apify';
import { CheerioCrawler } from 'crawlee';
import { router } from './routes.js';
await Actor.init();
const startUrls = ['https://apify.com'];
const proxyConfiguration = await Actor.createProxyConfiguration();
const crawler = new CheerioCrawler({
    proxyConfiguration,
    requestHandler: router,
});
void crawler.run(startUrls); // no await - means we don't wait for the run to finish.
await new Promise((resolve) => {
    setTimeout(() => {
        crawler.autoscaledPool.abort().then(resolve); // Aborting the run from outside the crawler
    }, 5_000); // Waiting 5 sec
});
// Exit successfully
await Actor.exit();

// routes.js
import { Dataset, createCheerioRouter } from 'crawlee';
export const router = createCheerioRouter();
// ...
router.addHandler('detail', async ({ request, $, log, crawler }) => {
    // ...
    // Aborting the crawler from inside a request handler.
    // Uses the same autoscaledPool.abort() call as the main file.
    await crawler.autoscaledPool?.abort();
    // ...
});
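
The timer-based abort in the main file can be wrapped in a small reusable helper. The sketch below is only an illustration: `abortAfter` and `fakeCrawler` are hypothetical names that are not part of Crawlee or the Apify SDK, and the helper assumes nothing about the crawler beyond the `autoscaledPool.abort()` call already used above. Unlike the inline version, it settles even when the pool is missing or the abort fails, so the caller does not hang on a promise that never resolves.

```javascript
// Hypothetical helper (not part of Crawlee): aborts a running crawler after `delayMs`.
// It only relies on the crawler exposing `autoscaledPool.abort()`.
function abortAfter(crawler, delayMs) {
    return new Promise((resolve, reject) => {
        setTimeout(() => {
            const pool = crawler.autoscaledPool;
            if (!pool) {
                // No pool on the crawler (for example, the run has not started): nothing to abort.
                resolve(false);
                return;
            }
            // Forward failures so the caller never waits on a promise that cannot settle.
            pool.abort().then(() => resolve(true), reject);
        }, delayMs);
    });
}

// Stand-in object for trying the helper without a real crawl (an assumption for illustration).
const fakeCrawler = {
    aborted: false,
    autoscaledPool: {
        async abort() {
            fakeCrawler.aborted = true;
        },
    },
};
```

In the main file this would replace the inline promise with `await abortAfter(crawler, 5_000);` placed right before `await Actor.exit();`.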