2026-03-21 22:30:32 [scrapy.utils.log] INFO: Scrapy 2.11.1 started (bot: news_scraper) 2026-03-21 22:30:32 [scrapy.utils.log] INFO: Versions: lxml 6.0.2.0, libxml2 2.14.6, cssselect 1.3.0, parsel 1.10.0, w3lib 2.3.1, Twisted 25.5.0, Python 3.11.13 (main, Aug 12 2025, 22:39:41) [GCC 14.2.0], pyOpenSSL 25.3.0 (OpenSSL 3.5.3 16 Sep 2025), cryptography 46.0.1, Platform Linux-5.15.0-164-generic-x86_64-with 2026-03-21 22:30:32 [scrapy.addons] INFO: Enabled addons: [] 2026-03-21 22:30:32 [asyncio] DEBUG: Using selector: EpollSelector 2026-03-21 22:30:32 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.asyncioreactor.AsyncioSelectorReactor 2026-03-21 22:30:32 [scrapy.utils.log] DEBUG: Using asyncio event loop: asyncio.unix_events._UnixSelectorEventLoop 2026-03-21 22:30:32 [scrapy.extensions.telnet] INFO: Telnet Password: d2915d892a76e96c 2026-03-21 22:30:32 [botocore.hooks] DEBUG: Changing event name from creating-client-class.iot-data to creating-client-class.iot-data-plane 2026-03-21 22:30:32 [botocore.hooks] DEBUG: Changing event name from before-call.apigateway to before-call.api-gateway 2026-03-21 22:30:32 [botocore.hooks] DEBUG: Changing event name from request-created.machinelearning.Predict to request-created.machine-learning.Predict 2026-03-21 22:30:32 [botocore.hooks] DEBUG: Changing event name from before-parameter-build.autoscaling.CreateLaunchConfiguration to before-parameter-build.auto-scaling.CreateLaunchConfiguration 2026-03-21 22:30:32 [botocore.hooks] DEBUG: Changing event name from before-parameter-build.route53 to before-parameter-build.route-53 2026-03-21 22:30:32 [botocore.hooks] DEBUG: Changing event name from request-created.cloudsearchdomain.Search to request-created.cloudsearch-domain.Search 2026-03-21 22:30:32 [botocore.hooks] DEBUG: Changing event name from docs.*.autoscaling.CreateLaunchConfiguration.complete-section to docs.*.auto-scaling.CreateLaunchConfiguration.complete-section 2026-03-21 22:30:32 [botocore.hooks] DEBUG: Changing 
event name from before-parameter-build.logs.CreateExportTask to before-parameter-build.cloudwatch-logs.CreateExportTask 2026-03-21 22:30:32 [botocore.hooks] DEBUG: Changing event name from docs.*.logs.CreateExportTask.complete-section to docs.*.cloudwatch-logs.CreateExportTask.complete-section 2026-03-21 22:30:32 [botocore.hooks] DEBUG: Changing event name from before-parameter-build.cloudsearchdomain.Search to before-parameter-build.cloudsearch-domain.Search 2026-03-21 22:30:32 [botocore.hooks] DEBUG: Changing event name from docs.*.cloudsearchdomain.Search.complete-section to docs.*.cloudsearch-domain.Search.complete-section 2026-03-21 22:30:32 [botocore.loaders] DEBUG: Loading JSON file: /usr/local/lib/python3.11/site-packages/botocore/data/endpoints.json 2026-03-21 22:30:32 [botocore.loaders] DEBUG: Loading JSON file: /usr/local/lib/python3.11/site-packages/botocore/data/sdk-default-configuration.json 2026-03-21 22:30:32 [botocore.hooks] DEBUG: Event choose-service-name: calling handler 2026-03-21 22:30:32 [botocore.loaders] DEBUG: Loading JSON file: /usr/local/lib/python3.11/site-packages/botocore/data/s3/2006-03-01/service-2.json.gz 2026-03-21 22:30:32 [botocore.loaders] DEBUG: Loading JSON file: /usr/local/lib/python3.11/site-packages/botocore/data/s3/2006-03-01/endpoint-rule-set-1.json.gz 2026-03-21 22:30:32 [botocore.loaders] DEBUG: Loading JSON file: /usr/local/lib/python3.11/site-packages/botocore/data/partitions.json 2026-03-21 22:30:32 [botocore.hooks] DEBUG: Event creating-client-class.s3: calling handler 2026-03-21 22:30:32 [botocore.hooks] DEBUG: Event creating-client-class.s3: calling handler ._handler at 0x7f1631fc0720> 2026-03-21 22:30:32 [botocore.hooks] DEBUG: Event creating-client-class.s3: calling handler 2026-03-21 22:30:32 [botocore.endpoint] DEBUG: Setting s3 timeout as (60, 60) 2026-03-21 22:30:32 [botocore.loaders] DEBUG: Loading JSON file: /usr/local/lib/python3.11/site-packages/botocore/data/_retry.json 2026-03-21 22:30:32 
[botocore.client] DEBUG: Registering retry handlers for service: s3 2026-03-21 22:30:32 [botocore.utils] DEBUG: Registering S3 region redirector handler 2026-03-21 22:30:32 [botocore.utils] DEBUG: Registering S3Express Identity Resolver 2026-03-21 22:30:32 [scrapy.middleware] INFO: Enabled extensions: ['scrapy.extensions.corestats.CoreStats', 'scrapy.extensions.telnet.TelnetConsole', 'scrapy.extensions.memusage.MemoryUsage', 'scrapy.extensions.closespider.CloseSpider', 'scrapy.extensions.feedexport.FeedExporter', 'scrapy.extensions.logstats.LogStats', 'scrapy.extensions.throttle.AutoThrottle'] 2026-03-21 22:30:32 [scrapy.crawler] INFO: Overridden settings: {'AUTOTHROTTLE_ENABLED': True, 'BOT_NAME': 'news_scraper', 'CLOSESPIDER_TIMEOUT': 1800, 'CONCURRENT_REQUESTS': 4, 'DOWNLOAD_DELAY': 2, 'FEED_EXPORT_ENCODING': 'utf-8', 'LOG_FILE': '/opt/scrapyd/logs/news_scraper/saostar_timestamp/870fa2f2257511f1a8c68655d067ffdb.log', 'NEWSPIDER_MODULE': 'news_scraper.spiders', 'REQUEST_FINGERPRINTER_IMPLEMENTATION': '2.7', 'ROBOTSTXT_OBEY': True, 'SPIDER_MODULES': ['news_scraper.spiders'], 'TWISTED_REACTOR': 'twisted.internet.asyncioreactor.AsyncioSelectorReactor'} 2026-03-21 22:30:33 [scrapy.middleware] INFO: Enabled downloader middlewares: ['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware', 'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware', 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware', 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware', 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware', 'news_scraper.middlewares.NewsScraperDownloaderMiddleware', 'scrapy.downloadermiddlewares.retry.RetryMiddleware', 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware', 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware', 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware', 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware', 
'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware', 'scrapy.downloadermiddlewares.stats.DownloaderStats'] 2026-03-21 22:30:33 [scrapy.middleware] INFO: Enabled spider middlewares: ['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware', 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware', 'scrapy.spidermiddlewares.referer.RefererMiddleware', 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware', 'scrapy.spidermiddlewares.depth.DepthMiddleware'] 2026-03-21 22:30:33 [scrapy.middleware] INFO: Enabled item pipelines: [] 2026-03-21 22:30:33 [scrapy.core.engine] INFO: Spider opened 2026-03-21 22:30:33 [botocore.hooks] DEBUG: Changing event name from creating-client-class.iot-data to creating-client-class.iot-data-plane 2026-03-21 22:30:33 [botocore.hooks] DEBUG: Changing event name from before-call.apigateway to before-call.api-gateway 2026-03-21 22:30:33 [botocore.hooks] DEBUG: Changing event name from request-created.machinelearning.Predict to request-created.machine-learning.Predict 2026-03-21 22:30:33 [botocore.hooks] DEBUG: Changing event name from before-parameter-build.autoscaling.CreateLaunchConfiguration to before-parameter-build.auto-scaling.CreateLaunchConfiguration 2026-03-21 22:30:33 [botocore.hooks] DEBUG: Changing event name from before-parameter-build.route53 to before-parameter-build.route-53 2026-03-21 22:30:33 [botocore.hooks] DEBUG: Changing event name from request-created.cloudsearchdomain.Search to request-created.cloudsearch-domain.Search 2026-03-21 22:30:33 [botocore.hooks] DEBUG: Changing event name from docs.*.autoscaling.CreateLaunchConfiguration.complete-section to docs.*.auto-scaling.CreateLaunchConfiguration.complete-section 2026-03-21 22:30:33 [botocore.hooks] DEBUG: Changing event name from before-parameter-build.logs.CreateExportTask to before-parameter-build.cloudwatch-logs.CreateExportTask 2026-03-21 22:30:33 [botocore.hooks] DEBUG: Changing event name from docs.*.logs.CreateExportTask.complete-section to 
docs.*.cloudwatch-logs.CreateExportTask.complete-section 2026-03-21 22:30:33 [botocore.hooks] DEBUG: Changing event name from before-parameter-build.cloudsearchdomain.Search to before-parameter-build.cloudsearch-domain.Search 2026-03-21 22:30:33 [botocore.hooks] DEBUG: Changing event name from docs.*.cloudsearchdomain.Search.complete-section to docs.*.cloudsearch-domain.Search.complete-section 2026-03-21 22:30:33 [botocore.loaders] DEBUG: Loading JSON file: /usr/local/lib/python3.11/site-packages/botocore/data/endpoints.json 2026-03-21 22:30:33 [botocore.loaders] DEBUG: Loading JSON file: /usr/local/lib/python3.11/site-packages/botocore/data/sdk-default-configuration.json 2026-03-21 22:30:33 [botocore.hooks] DEBUG: Event choose-service-name: calling handler 2026-03-21 22:30:33 [botocore.loaders] DEBUG: Loading JSON file: /usr/local/lib/python3.11/site-packages/botocore/data/s3/2006-03-01/service-2.json.gz 2026-03-21 22:30:33 [botocore.loaders] DEBUG: Loading JSON file: /usr/local/lib/python3.11/site-packages/botocore/data/s3/2006-03-01/endpoint-rule-set-1.json.gz 2026-03-21 22:30:33 [botocore.loaders] DEBUG: Loading JSON file: /usr/local/lib/python3.11/site-packages/botocore/data/partitions.json 2026-03-21 22:30:33 [botocore.hooks] DEBUG: Event creating-client-class.s3: calling handler 2026-03-21 22:30:33 [botocore.hooks] DEBUG: Event creating-client-class.s3: calling handler ._handler at 0x7f163104d620> 2026-03-21 22:30:33 [botocore.hooks] DEBUG: Event creating-client-class.s3: calling handler 2026-03-21 22:30:33 [botocore.endpoint] DEBUG: Setting s3 timeout as (60, 60) 2026-03-21 22:30:33 [botocore.loaders] DEBUG: Loading JSON file: /usr/local/lib/python3.11/site-packages/botocore/data/_retry.json 2026-03-21 22:30:33 [botocore.client] DEBUG: Registering retry handlers for service: s3 2026-03-21 22:30:33 [botocore.utils] DEBUG: Registering S3 region redirector handler 2026-03-21 22:30:33 [botocore.utils] DEBUG: Registering S3Express Identity Resolver 2026-03-21 
22:30:33 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min) 2026-03-21 22:30:33 [saostar_timestamp] INFO: Spider opened: saostar_timestamp 2026-03-21 22:30:33 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6026 2026-03-21 22:30:33 [scrapy.core.engine] DEBUG: Crawled (200) (referer: None) 2026-03-21 22:30:37 [scrapy.core.engine] DEBUG: Crawled (200) (referer: None) 2026-03-21 22:30:40 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/) 2026-03-21 22:30:42 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:30:42 [saostar_timestamp] INFO: 2026-03-21 is out of date range: from 2026-03-22 to 2026-03-22, skipping article: https://www.saostar.vn/sac-mau-cuoc-song/deo-day-vang-2-nguoi-phu-nu-dong-nai-gap-phai-su-viec-kinh-hoang-202603212121096134.html 2026-03-21 22:30:45 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:30:45 [saostar_timestamp] INFO: 2026-03-21 is out of date range: from 2026-03-22 to 2026-03-22, skipping article: https://www.saostar.vn/sac-mau-cuoc-song/ban-sau-rieng-tuong-gap-khach-sop-nguoi-phu-nu-phai-cau-cuu-cong-an-202603212127172322.html 2026-03-21 22:30:48 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:30:48 [saostar_timestamp] INFO: 2026-03-21 is out of date range: from 2026-03-22 to 2026-03-22, skipping article: https://www.saostar.vn/sac-mau-cuoc-song/nen-uong-nuoc-theo-cam-giac-hay-theo-gio-co-dinh-cach-nao-tot-hon-202603212134002561.html 2026-03-21 22:30:50 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:30:50 [saostar_timestamp] INFO: 2026-03-21 is out of date range: from 2026-03-22 to 2026-03-22, skipping article: https://www.saostar.vn/sac-mau-cuoc-song/4-con-giap-buoc-vao-chu-ky-an-nen-lam-ra-lam-it-huong-nhieu-202603212155247747.html 
2026-03-21 22:30:53 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:30:53 [saostar_timestamp] INFO: 2026-03-21 is out of date range: from 2026-03-22 to 2026-03-22, skipping article: https://www.saostar.vn/am-nhac/chong-beyonce-chinh-thuc-doi-nghe-danh-202603211317053164.html 2026-03-21 22:30:55 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:30:55 [saostar_timestamp] INFO: 2026-03-21 is out of date range: from 2026-03-22 to 2026-03-22, skipping article: https://www.saostar.vn/dien-anh/het-hi-vong-voi-nsnd-hong-van-202603211532278797.html 2026-03-21 22:30:57 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:30:57 [saostar_timestamp] INFO: 2026-03-21 is out of date range: from 2026-03-22 to 2026-03-22, skipping article: https://www.saostar.vn/giai-tri/toc-tien-tha-dang-voi-do-boi-hai-manh-quyen-ru-202603211520049243.html 2026-03-21 22:31:00 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:31:00 [saostar_timestamp] INFO: 2026-03-21 is out of date range: from 2026-03-22 to 2026-03-22, skipping article: https://www.saostar.vn/dien-anh/khuong-ngoc-noi-ro-ly-do-bat-ngo-xuong-toc-202603211605470578.html 2026-03-21 22:31:03 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:31:03 [saostar_timestamp] INFO: 2026-03-21 is out of date range: from 2026-03-22 to 2026-03-22, skipping article: https://www.saostar.vn/sao-sport/tin-chieu-21-3-fifa-da-dung-tuyen-viet-nam-phai-thay-doi-202603211620148512.html 2026-03-21 22:31:05 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:31:05 [saostar_timestamp] INFO: 2026-03-21 is out of date range: from 2026-03-22 to 2026-03-22, skipping article: 
https://www.saostar.vn/kinh-doanh/gia-vang-2026-khong-con-tang-soc-nhung-vi-sao-van-kho-giam-sau-202603211629197704.html 2026-03-21 22:31:08 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:31:08 [saostar_timestamp] INFO: 2026-03-21 is out of date range: from 2026-03-22 to 2026-03-22, skipping article: https://www.saostar.vn/an-choi-kham-pha/khi-ngoc-trinh-quoc-truong-goi-nho-vi-ngot-am-thuc-o-mien-tay-202603211642542312.html 2026-03-21 22:31:10 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:31:11 [saostar_timestamp] INFO: 2026-03-21 is out of date range: from 2026-03-22 to 2026-03-22, skipping article: https://www.saostar.vn/giai-tri/nu-ca-si-lee-young-hyun-lot-xac-giam-33kg-nhung-phai-bo-an-kieng-202603211331202866.html 2026-03-21 22:31:13 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:31:13 [saostar_timestamp] INFO: 2026-03-21 is out of date range: from 2026-03-22 to 2026-03-22, skipping article: https://www.saostar.vn/am-nhac/hinh-anh-gay-soc-cua-thien-hau-vuong-phi-o-san-bay-202603211524310403.html 2026-03-21 22:31:16 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:31:16 [saostar_timestamp] INFO: 2026-03-21 is out of date range: from 2026-03-22 to 2026-03-22, skipping article: https://www.saostar.vn/nguoi-mau-hoa-hau/loat-anh-ngot-ngao-cua-ky-duyen-va-thien-an-khi-di-du-lich-o-thai-lan-202603211634317779.html 2026-03-21 22:31:18 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:31:18 [saostar_timestamp] INFO: 2026-03-21 is out of date range: from 2026-03-22 to 2026-03-22, skipping article: https://www.saostar.vn/dien-anh/quynh-kool-dap-tra-khan-gia-khi-bi-noi-do-202603211702186874.html 2026-03-21 22:31:21 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:31:21 
[saostar_timestamp] INFO: 2026-03-21 is out of date range: from 2026-03-22 to 2026-03-22, skipping article: https://www.saostar.vn/sac-mau-cuoc-song/3-phan-cua-trung-nhieu-nguoi-an-nham-khong-phai-phan-nao-cung-tot-202603211741002818.html 2026-03-21 22:31:24 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:31:24 [saostar_timestamp] INFO: 2026-03-21 is out of date range: from 2026-03-22 to 2026-03-22, skipping article: https://www.saostar.vn/sao-sport/tuyen-viet-nam-lo-sot-vo-sau-quyet-dinh-cua-fifa-202603211747595027.html 2026-03-21 22:31:27 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:31:27 [saostar_timestamp] INFO: 2026-03-21 is out of date range: from 2026-03-22 to 2026-03-22, skipping article: https://www.saostar.vn/an-choi-kham-pha/loai-ca-phat-ra-am-thanh-khi-bi-bat-len-chi-co-o-mien-tay-202603211751546695.html 2026-03-21 22:31:29 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:31:29 [saostar_timestamp] INFO: 2026-03-21 is out of date range: from 2026-03-22 to 2026-03-22, skipping article: https://www.saostar.vn/giai-tri/cat-phuong-vui-mung-truoc-khung-hinh-hoa-minzy-ben-ban-trai-va-quy-tu-202603211644491268.html 2026-03-21 22:31:31 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:31:31 [saostar_timestamp] INFO: 2026-03-21 is out of date range: from 2026-03-22 to 2026-03-22, skipping article: https://www.saostar.vn/sao-sport/niem-vui-vo-dich-cho-nhat-ban-noi-dau-cua-australia-202603211801102828.html 2026-03-21 22:31:33 [scrapy.extensions.logstats] INFO: Crawled 23 pages (at 23 pages/min), scraped 0 items (at 0 items/min) 2026-03-21 22:31:34 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:31:34 [saostar_timestamp] INFO: 2026-03-21 is out of date range: from 2026-03-22 to 2026-03-22, skipping article: 
https://www.saostar.vn/dien-anh/dam-tung-van-chung-minh-dang-cap-bat-bai-202603211726092832.html 2026-03-21 22:31:35 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:31:35 [saostar_timestamp] INFO: 2026-03-21 is out of date range: from 2026-03-22 to 2026-03-22, skipping article: https://www.saostar.vn/dien-anh/tiec-nuoi-qua-lon-cho-nsnd-le-khanh-202603211859332575.html 2026-03-21 22:31:38 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:31:38 [saostar_timestamp] INFO: 2026-03-21 is out of date range: from 2026-03-22 to 2026-03-22, skipping article: https://www.saostar.vn/dien-anh/cu-truot-chan-dang-tiec-cua-phuong-anh-dao-va-tuan-tran-202603211859540628.html 2026-03-21 22:31:39 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:31:39 [saostar_timestamp] INFO: 2026-03-21 is out of date range: from 2026-03-22 to 2026-03-22, skipping article: https://www.saostar.vn/am-nhac/t-o-p-tai-xuat-sau-13-nam-g-dragon-co-dong-thai-gay-chu-y-202603211527282529.html 2026-03-21 22:31:41 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:31:42 [saostar_timestamp] INFO: 2026-03-21 is out of date range: from 2026-03-22 to 2026-03-22, skipping article: https://www.saostar.vn/giai-tri/thu-ky-tung-duoc-moi-an-toi-voi-so-tien-gan-4-ty-202603211511254226.html 2026-03-21 22:31:44 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:31:45 [saostar_timestamp] INFO: 2026-03-21 is out of date range: from 2026-03-22 to 2026-03-22, skipping article: https://www.saostar.vn/sac-mau-cuoc-song/10-ngay-cuoi-thang-3-4-con-giap-but-pha-manh-me-tien-bac-ve-don-dap-202603211752480874.html 2026-03-21 22:31:47 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:31:47 [saostar_timestamp] INFO: 2026-03-21 is out of date 
range: from 2026-03-22 to 2026-03-22, skipping article: https://www.saostar.vn/am-nhac/qua-bat-ngo-ve-dieu-nhi-202603211528002992.html 2026-03-21 22:31:49 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:31:49 [saostar_timestamp] INFO: 2026-03-21 is out of date range: from 2026-03-22 to 2026-03-22, skipping article: https://www.saostar.vn/sao-sport/ket-thuc-som-voi-hlv-vu-tien-thanh-202603212104172758.html 2026-03-21 22:31:52 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:31:52 [saostar_timestamp] INFO: 2026-03-21 is out of date range: from 2026-03-22 to 2026-03-22, skipping article: https://www.saostar.vn/sao-hoc-duong/nu-sinh-mang-114-chi-vang-di-ban-muc-dich-dang-sau-gay-soc-202603212110545006.html 2026-03-21 22:31:54 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:31:54 [saostar_timestamp] INFO: 2026-03-21 is out of date range: from 2026-03-22 to 2026-03-22, skipping article: https://www.saostar.vn/sac-mau-cuoc-song/de-tai-san-lon-tren-xe-nguoi-phu-nu-luc-tro-ra-kiem-tra-phai-bao-ca-202603212204094835.html 2026-03-21 22:31:57 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:31:57 [saostar_timestamp] INFO: 2026-03-21 is out of date range: from 2026-03-22 to 2026-03-22, skipping article: https://www.saostar.vn/sac-mau-cuoc-song/bat-ke-gay-ra-noi-am-anh-tai-cac-nha-tro-o-da-nang-202603212212346583.html 2026-03-21 22:31:59 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:31:59 [saostar_timestamp] INFO: 2026-03-22 00:00:00 smaller than 2026-03-22 05:10:00 2026-03-21 22:32:01 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2026-03-21 22:32:01 [saostar_timestamp] INFO: 2026-03-22 00:00:00 smaller than 2026-03-22 05:10:00 2026-03-21 22:32:03 [scrapy.core.engine] DEBUG: Crawled (200) 
(referer: https://www.saostar.vn/tin-moi/)
2026-03-21 22:32:04 [saostar_timestamp] INFO: 2026-03-22 00:00:00 smaller than 2026-03-22 05:10:00
2026-03-21 22:32:06 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/)
2026-03-21 22:32:06 [saostar_timestamp] INFO: 2026-03-22 04:30:00 smaller than 2026-03-22 05:10:00
2026-03-21 22:32:09 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/)
2026-03-21 22:32:09 [saostar_timestamp] INFO: 2026-03-22 05:00:00 smaller than 2026-03-22 05:10:00
2026-03-21 22:32:11 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/)
2026-03-21 22:32:11 [saostar_timestamp] INFO: 2026-03-22 05:00:00 smaller than 2026-03-22 05:10:00
2026-03-21 22:32:14 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/)
2026-03-21 22:32:14 [saostar_timestamp] INFO: 2026-03-22 05:02:13.017000 smaller than 2026-03-22 05:10:00
2026-03-21 22:32:16 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/)
2026-03-21 22:32:16 [saostar_timestamp] INFO: 2026-03-22 05:03:18.680000 smaller than 2026-03-22 05:10:00
2026-03-21 22:32:18 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/)
2026-03-21 22:32:18 [scrapy.core.scraper] ERROR: Spider error processing (referer: https://www.saostar.vn/tin-moi/)
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/scrapy/utils/defer.py", line 279, in iter_errback
    yield next(it)
          ^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/scrapy/utils/python.py", line 350, in __next__
    return next(self.data)
           ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/scrapy/utils/python.py", line 350, in __next__
    return next(self.data)
           ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/scrapy/core/spidermw.py", line 106, in process_sync
    for r in iterable:
  File "/usr/local/lib/python3.11/site-packages/scrapy/spidermiddlewares/offsite.py", line 28, in
    return (r for r in result or () if self._filter(r, spider))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/scrapy/core/spidermw.py", line 106, in process_sync
    for r in iterable:
  File "/usr/local/lib/python3.11/site-packages/scrapy/spidermiddlewares/referer.py", line 352, in
    return (self._set_referer(r, response) for r in result or ())
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/scrapy/core/spidermw.py", line 106, in process_sync
    for r in iterable:
  File "/usr/local/lib/python3.11/site-packages/scrapy/spidermiddlewares/urllength.py", line 27, in
    return (r for r in result or () if self._filter(r, spider))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/scrapy/core/spidermw.py", line 106, in process_sync
    for r in iterable:
  File "/usr/local/lib/python3.11/site-packages/scrapy/spidermiddlewares/depth.py", line 31, in
    return (r for r in result or () if self._filter(r, response, spider))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/scrapy/core/spidermw.py", line 106, in process_sync
    for r in iterable:
  File "/opt/scrapy_projects/news_scraper/spiders/saostar_timestamp_spider.py", line 90, in parse_article
    if item["content"]:
       ~~~~^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/scrapy/item.py", line 79, in __getitem__
    return self._values[key]
           ~~~~~~~~~~~~^^^^^
KeyError: 'content'
2026-03-21 22:32:18 [scrapy.core.engine] INFO: Closing spider (finished)
2026-03-21 22:32:18 [boto3.s3.transfer] DEBUG: Opting out of CRT Transfer Manager. Preferred client: auto, CRT available: False, Instance Optimized: False.
2026-03-21 22:32:18 [boto3.s3.transfer] DEBUG: Using default client.
pid: 24984, thread: 139733287238456 2026-03-21 22:32:18 [s3transfer.utils] DEBUG: Acquiring 0 2026-03-21 22:32:18 [s3transfer.tasks] DEBUG: UploadSubmissionTask(transfer_id=0, {'transfer_future': }) about to wait for the following futures [] 2026-03-21 22:32:18 [s3transfer.tasks] DEBUG: UploadSubmissionTask(transfer_id=0, {'transfer_future': }) done waiting for dependent futures 2026-03-21 22:32:18 [s3transfer.tasks] DEBUG: Executing task UploadSubmissionTask(transfer_id=0, {'transfer_future': }) with kwargs {'client': , 'config': , 'osutil': , 'request_executor': , 'transfer_future': } 2026-03-21 22:32:18 [s3transfer.futures] DEBUG: Submitting task PutObjectTask(transfer_id=0, {'bucket': 'dagster-output-data', 'key': 'saostar_timestamp/saostar_timestamp_870fa2f2257511f1a8c68655d067ffdb_scheduled_2026-03-22.jl', 'extra_args': {}}) to executor for transfer request: 0. 2026-03-21 22:32:18 [s3transfer.utils] DEBUG: Acquiring 0 2026-03-21 22:32:18 [s3transfer.tasks] DEBUG: PutObjectTask(transfer_id=0, {'bucket': 'dagster-output-data', 'key': 'saostar_timestamp/saostar_timestamp_870fa2f2257511f1a8c68655d067ffdb_scheduled_2026-03-22.jl', 'extra_args': {}}) about to wait for the following futures [] 2026-03-21 22:32:18 [s3transfer.utils] DEBUG: Releasing acquire 0/None 2026-03-21 22:32:18 [s3transfer.tasks] DEBUG: PutObjectTask(transfer_id=0, {'bucket': 'dagster-output-data', 'key': 'saostar_timestamp/saostar_timestamp_870fa2f2257511f1a8c68655d067ffdb_scheduled_2026-03-22.jl', 'extra_args': {}}) done waiting for dependent futures 2026-03-21 22:32:18 [s3transfer.tasks] DEBUG: Executing task PutObjectTask(transfer_id=0, {'bucket': 'dagster-output-data', 'key': 'saostar_timestamp/saostar_timestamp_870fa2f2257511f1a8c68655d067ffdb_scheduled_2026-03-22.jl', 'extra_args': {}}) with kwargs {'client': , 'fileobj': , 'bucket': 'dagster-output-data', 'key': 'saostar_timestamp/saostar_timestamp_870fa2f2257511f1a8c68655d067ffdb_scheduled_2026-03-22.jl', 'extra_args': {}} 2026-03-21 
22:32:18 [botocore.hooks] DEBUG: Event before-parameter-build.s3.PutObject: calling handler 2026-03-21 22:32:18 [botocore.hooks] DEBUG: Event before-parameter-build.s3.PutObject: calling handler 2026-03-21 22:32:18 [botocore.hooks] DEBUG: Event before-parameter-build.s3.PutObject: calling handler 2026-03-21 22:32:18 [botocore.hooks] DEBUG: Event before-parameter-build.s3.PutObject: calling handler 2026-03-21 22:32:18 [botocore.hooks] DEBUG: Event before-parameter-build.s3.PutObject: calling handler 2026-03-21 22:32:18 [botocore.hooks] DEBUG: Event before-parameter-build.s3.PutObject: calling handler > 2026-03-21 22:32:18 [botocore.hooks] DEBUG: Event before-parameter-build.s3.PutObject: calling handler > 2026-03-21 22:32:18 [botocore.hooks] DEBUG: Event before-parameter-build.s3.PutObject: calling handler 2026-03-21 22:32:18 [botocore.hooks] DEBUG: Event before-endpoint-resolution.s3: calling handler 2026-03-21 22:32:18 [botocore.hooks] DEBUG: Event before-endpoint-resolution.s3: calling handler > 2026-03-21 22:32:18 [botocore.regions] DEBUG: Calling endpoint provider with parameters: {'Bucket': 'dagster-output-data', 'Region': 'us-east-1', 'UseFIPS': False, 'UseDualStack': False, 'Endpoint': 'https://lake-api.actable.ai/', 'ForcePathStyle': True, 'Accelerate': False, 'UseGlobalEndpoint': True, 'Key': 'saostar_timestamp/saostar_timestamp_870fa2f2257511f1a8c68655d067ffdb_scheduled_2026-03-22.jl', 'DisableMultiRegionAccessPoints': False, 'UseArnRegion': True} 2026-03-21 22:32:18 [botocore.regions] DEBUG: Endpoint provider result: https://lake-api.actable.ai/dagster-output-data 2026-03-21 22:32:18 [botocore.regions] DEBUG: Selecting from endpoint provider's list of auth schemes: "sigv4". 
User selected auth scheme is: "None" 2026-03-21 22:32:18 [botocore.regions] DEBUG: Selected auth type "v4" as "v4" with signing context params: {'region': 'us-east-1', 'signing_name': 's3', 'disableDoubleEncoding': True} 2026-03-21 22:32:18 [botocore.hooks] DEBUG: Event before-call.s3.PutObject: calling handler 2026-03-21 22:32:18 [botocore.hooks] DEBUG: Event before-call.s3.PutObject: calling handler 2026-03-21 22:32:18 [botocore.handlers] DEBUG: Adding expect 100 continue header to request. 2026-03-21 22:32:18 [botocore.hooks] DEBUG: Event before-call.s3.PutObject: calling handler > 2026-03-21 22:32:18 [botocore.hooks] DEBUG: Event before-call.s3.PutObject: calling handler 2026-03-21 22:32:18 [botocore.hooks] DEBUG: Event before-call.s3.PutObject: calling handler 2026-03-21 22:32:18 [botocore.endpoint] DEBUG: Making request for OperationModel(name=PutObject) with params: {'url_path': '/saostar_timestamp/saostar_timestamp_870fa2f2257511f1a8c68655d067ffdb_scheduled_2026-03-22.jl', 'query_string': {}, 'method': 'PUT', 'headers': {'User-Agent': 'Boto3/1.34.57 md/Botocore#1.34.162 ua/2.0 os/linux#5.15.0-164-generic md/arch#x86_64 lang/python#3.11.13 md/pyimpl#CPython cfg/retry-mode#legacy Botocore/1.34.162', 'Content-MD5': '1B2M2Y8AsgTpgAmY7PhCfg==', 'Expect': '100-continue'}, 'body': , 'auth_path': '/dagster-output-data/saostar_timestamp/saostar_timestamp_870fa2f2257511f1a8c68655d067ffdb_scheduled_2026-03-22.jl', 'url': 'https://lake-api.actable.ai/dagster-output-data/saostar_timestamp/saostar_timestamp_870fa2f2257511f1a8c68655d067ffdb_scheduled_2026-03-22.jl', 'context': {'client_region': 'us-east-1', 'client_config': , 'has_streaming_input': True, 'auth_type': 'v4', 's3_redirect': {'redirected': False, 'bucket': 'dagster-output-data', 'params': {'Bucket': 'dagster-output-data', 'Key': 'saostar_timestamp/saostar_timestamp_870fa2f2257511f1a8c68655d067ffdb_scheduled_2026-03-22.jl', 'Body': }}, 'input_params': {'Bucket': 'dagster-output-data', 'Key': 
'saostar_timestamp/saostar_timestamp_870fa2f2257511f1a8c68655d067ffdb_scheduled_2026-03-22.jl'}, 'signing': {'region': 'us-east-1', 'signing_name': 's3', 'disableDoubleEncoding': True}, 'endpoint_properties': {'authSchemes': [{'disableDoubleEncoding': True, 'name': 'sigv4', 'signingName': 's3', 'signingRegion': 'us-east-1'}]}}} 2026-03-21 22:32:18 [botocore.hooks] DEBUG: Event request-created.s3.PutObject: calling handler 2026-03-21 22:32:18 [botocore.hooks] DEBUG: Event request-created.s3.PutObject: calling handler > 2026-03-21 22:32:18 [botocore.hooks] DEBUG: Event choose-signer.s3.PutObject: calling handler > 2026-03-21 22:32:18 [botocore.hooks] DEBUG: Event choose-signer.s3.PutObject: calling handler 2026-03-21 22:32:18 [botocore.hooks] DEBUG: Event before-sign.s3.PutObject: calling handler 2026-03-21 22:32:18 [botocore.hooks] DEBUG: Event before-sign.s3.PutObject: calling handler > 2026-03-21 22:32:18 [botocore.auth] DEBUG: Calculating signature using v4 auth. 2026-03-21 22:32:18 [botocore.auth] DEBUG: CanonicalRequest: PUT /dagster-output-data/saostar_timestamp/saostar_timestamp_870fa2f2257511f1a8c68655d067ffdb_scheduled_2026-03-22.jl content-md5:1B2M2Y8AsgTpgAmY7PhCfg== host:lake-api.actable.ai x-amz-content-sha256:UNSIGNED-PAYLOAD x-amz-date:20260321T223218Z content-md5;host;x-amz-content-sha256;x-amz-date UNSIGNED-PAYLOAD 2026-03-21 22:32:18 [botocore.auth] DEBUG: StringToSign: AWS4-HMAC-SHA256 20260321T223218Z 20260321/us-east-1/s3/aws4_request 2873c3c360044ba84612aefb720aab1c96c3152703412e39be2b97f44d905867 2026-03-21 22:32:18 [botocore.auth] DEBUG: Signature: 2f565e5fbf13e8bd6c610e4d289f0a76466faaee75df16f06866be4a0a9b5b6c 2026-03-21 22:32:18 [botocore.hooks] DEBUG: Event request-created.s3.PutObject: calling handler 2026-03-21 22:32:18 [botocore.hooks] DEBUG: Event request-created.s3.PutObject: calling handler 2026-03-21 22:32:18 [botocore.endpoint] DEBUG: Sending http request: 2026-03-21 22:32:18 [botocore.httpsession] DEBUG: Certificate path: 
/usr/local/lib/python3.11/site-packages/certifi/cacert.pem
2026-03-21 22:32:18 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): lake-api.actable.ai:443
2026-03-21 22:32:18 [botocore.awsrequest] DEBUG: Waiting for 100 Continue response.
2026-03-21 22:32:18 [botocore.awsrequest] DEBUG: 100 Continue response seen, now sending request body.
2026-03-21 22:32:18 [urllib3.connectionpool] DEBUG: https://lake-api.actable.ai:443 "PUT /dagster-output-data/saostar_timestamp/saostar_timestamp_870fa2f2257511f1a8c68655d067ffdb_scheduled_2026-03-22.jl HTTP/1.1" 200 0
2026-03-21 22:32:18 [botocore.parsers] DEBUG: Response headers: {'Server': 'nginx/1.18.0 (Ubuntu)', 'Date': 'Sat, 21 Mar 2026 22:32:18 GMT', 'Content-Length': '0', 'Connection': 'keep-alive', 'Accept-Ranges': 'bytes', 'ETag': '"d41d8cd98f00b204e9800998ecf8427e"', 'Strict-Transport-Security': 'max-age=31536000; includeSubDomains', 'Vary': 'Origin, Accept-Encoding', 'X-Amz-Bucket-Region': 'us-east-1', 'X-Amz-Id-2': 'dd9025bab4ad464b049177c95eb6ebf374d3b3fd1af9251148b658df7ac2e3e8', 'X-Amz-Request-Id': '189EFBF64E26B7BC', 'X-Content-Type-Options': 'nosniff', 'X-Ratelimit-Limit': '3162', 'X-Ratelimit-Remaining': '3162', 'X-Xss-Protection': '1; mode=block'}
2026-03-21 22:32:18 [botocore.parsers] DEBUG: Response body: b''
2026-03-21 22:32:18 [botocore.hooks] DEBUG: Event needs-retry.s3.PutObject: calling handler
2026-03-21 22:32:18 [botocore.retryhandler] DEBUG: No retry needed.
2026-03-21 22:32:18 [botocore.hooks] DEBUG: Event needs-retry.s3.PutObject: calling handler >
2026-03-21 22:32:18 [s3transfer.utils] DEBUG: Releasing acquire 0/None
2026-03-21 22:32:18 [scrapy.extensions.feedexport] INFO: Stored jsonlines feed (0 items) in: s3://dagster-output-data/saostar_timestamp/saostar_timestamp_870fa2f2257511f1a8c68655d067ffdb_scheduled_2026-03-22.jl
2026-03-21 22:32:18 [scrapy.statscollectors] INFO: Dumping Scrapy stats: {'downloader/request_bytes': 14462, 'downloader/request_count': 43, 'downloader/request_method_count/GET': 43, 'downloader/response_bytes': 471542, 'downloader/response_count': 43, 'downloader/response_status_count/200': 43, 'elapsed_time_seconds': 105.321184, 'feedexport/success_count/S3FeedStorage': 1, 'finish_reason': 'finished', 'finish_time': datetime.datetime(2026, 3, 21, 22, 32, 18, 373489, tzinfo=datetime.timezone.utc), 'httpcompression/response_bytes': 1719525, 'httpcompression/response_count': 43, 'log_count/DEBUG': 153, 'log_count/ERROR': 1, 'log_count/INFO': 52, 'memusage/max': 129081344, 'memusage/startup': 124702720, 'request_depth_max': 2, 'response_received_count': 43, 'robotstxt/request_count': 1, 'robotstxt/response_count': 1, 'robotstxt/response_status_count/200': 1, 'scheduler/dequeued': 42, 'scheduler/dequeued/memory': 42, 'scheduler/enqueued': 42, 'scheduler/enqueued/memory': 42, 'spider_exceptions/KeyError': 1, 'start_time': datetime.datetime(2026, 3, 21, 22, 30, 33, 52305, tzinfo=datetime.timezone.utc)}
2026-03-21 22:32:18 [scrapy.core.engine] INFO: Spider closed (finished)