2025-09-25 20:47:15 [scrapy.utils.log] INFO: Scrapy 2.11.1 started (bot: news_scraper) 2025-09-25 20:47:15 [scrapy.utils.log] INFO: Versions: lxml 6.0.0.0, libxml2 2.14.4, cssselect 1.3.0, parsel 1.10.0, w3lib 2.3.1, Twisted 25.5.0, Python 3.11.13 (main, Jul 15 2025, 19:29:01) [GCC 14.2.0], pyOpenSSL 25.1.0 (OpenSSL 3.5.1 1 Jul 2025), cryptography 45.0.5, Platform Linux-5.15.0-139-generic-x86_64-with 2025-09-25 20:47:15 [scrapy.addons] INFO: Enabled addons: [] 2025-09-25 20:47:15 [asyncio] DEBUG: Using selector: EpollSelector 2025-09-25 20:47:15 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.asyncioreactor.AsyncioSelectorReactor 2025-09-25 20:47:15 [scrapy.utils.log] DEBUG: Using asyncio event loop: asyncio.unix_events._UnixSelectorEventLoop 2025-09-25 20:47:15 [scrapy.extensions.telnet] INFO: Telnet Password: ce921782d4a16153 2025-09-25 20:47:15 [botocore.hooks] DEBUG: Changing event name from creating-client-class.iot-data to creating-client-class.iot-data-plane 2025-09-25 20:47:15 [botocore.hooks] DEBUG: Changing event name from before-call.apigateway to before-call.api-gateway 2025-09-25 20:47:15 [botocore.hooks] DEBUG: Changing event name from request-created.machinelearning.Predict to request-created.machine-learning.Predict 2025-09-25 20:47:15 [botocore.hooks] DEBUG: Changing event name from before-parameter-build.autoscaling.CreateLaunchConfiguration to before-parameter-build.auto-scaling.CreateLaunchConfiguration 2025-09-25 20:47:15 [botocore.hooks] DEBUG: Changing event name from before-parameter-build.route53 to before-parameter-build.route-53 2025-09-25 20:47:15 [botocore.hooks] DEBUG: Changing event name from request-created.cloudsearchdomain.Search to request-created.cloudsearch-domain.Search 2025-09-25 20:47:15 [botocore.hooks] DEBUG: Changing event name from docs.*.autoscaling.CreateLaunchConfiguration.complete-section to docs.*.auto-scaling.CreateLaunchConfiguration.complete-section 2025-09-25 20:47:15 [botocore.hooks] DEBUG: Changing 
event name from before-parameter-build.logs.CreateExportTask to before-parameter-build.cloudwatch-logs.CreateExportTask 2025-09-25 20:47:15 [botocore.hooks] DEBUG: Changing event name from docs.*.logs.CreateExportTask.complete-section to docs.*.cloudwatch-logs.CreateExportTask.complete-section 2025-09-25 20:47:15 [botocore.hooks] DEBUG: Changing event name from before-parameter-build.cloudsearchdomain.Search to before-parameter-build.cloudsearch-domain.Search 2025-09-25 20:47:15 [botocore.hooks] DEBUG: Changing event name from docs.*.cloudsearchdomain.Search.complete-section to docs.*.cloudsearch-domain.Search.complete-section 2025-09-25 20:47:15 [botocore.loaders] DEBUG: Loading JSON file: /usr/local/lib/python3.11/site-packages/botocore/data/endpoints.json 2025-09-25 20:47:15 [botocore.loaders] DEBUG: Loading JSON file: /usr/local/lib/python3.11/site-packages/botocore/data/sdk-default-configuration.json 2025-09-25 20:47:15 [botocore.hooks] DEBUG: Event choose-service-name: calling handler 2025-09-25 20:47:15 [botocore.loaders] DEBUG: Loading JSON file: /usr/local/lib/python3.11/site-packages/botocore/data/s3/2006-03-01/service-2.json.gz 2025-09-25 20:47:15 [botocore.loaders] DEBUG: Loading JSON file: /usr/local/lib/python3.11/site-packages/botocore/data/s3/2006-03-01/endpoint-rule-set-1.json.gz 2025-09-25 20:47:15 [botocore.loaders] DEBUG: Loading JSON file: /usr/local/lib/python3.11/site-packages/botocore/data/partitions.json 2025-09-25 20:47:15 [botocore.hooks] DEBUG: Event creating-client-class.s3: calling handler 2025-09-25 20:47:15 [botocore.hooks] DEBUG: Event creating-client-class.s3: calling handler ._handler at 0x7fd9d184fc40> 2025-09-25 20:47:15 [botocore.hooks] DEBUG: Event creating-client-class.s3: calling handler 2025-09-25 20:47:15 [botocore.endpoint] DEBUG: Setting s3 timeout as (60, 60) 2025-09-25 20:47:15 [botocore.loaders] DEBUG: Loading JSON file: /usr/local/lib/python3.11/site-packages/botocore/data/_retry.json 2025-09-25 20:47:15 
[botocore.client] DEBUG: Registering retry handlers for service: s3 2025-09-25 20:47:15 [botocore.utils] DEBUG: Registering S3 region redirector handler 2025-09-25 20:47:15 [botocore.utils] DEBUG: Registering S3Express Identity Resolver 2025-09-25 20:47:15 [scrapy.middleware] INFO: Enabled extensions: ['scrapy.extensions.corestats.CoreStats', 'scrapy.extensions.telnet.TelnetConsole', 'scrapy.extensions.memusage.MemoryUsage', 'scrapy.extensions.closespider.CloseSpider', 'scrapy.extensions.feedexport.FeedExporter', 'scrapy.extensions.logstats.LogStats', 'scrapy.extensions.throttle.AutoThrottle'] 2025-09-25 20:47:15 [scrapy.crawler] INFO: Overridden settings: {'AUTOTHROTTLE_ENABLED': True, 'BOT_NAME': 'news_scraper', 'CLOSESPIDER_TIMEOUT': 1800, 'CONCURRENT_REQUESTS': 4, 'DOWNLOAD_DELAY': 2, 'FEED_EXPORT_ENCODING': 'utf-8', 'LOG_FILE': '/opt/scrapyd/logs/news_scraper/saostar_timestamp/c5de51009a5011f086971e907748958e.log', 'NEWSPIDER_MODULE': 'news_scraper.spiders', 'REQUEST_FINGERPRINTER_IMPLEMENTATION': '2.7', 'ROBOTSTXT_OBEY': True, 'SPIDER_MODULES': ['news_scraper.spiders'], 'TWISTED_REACTOR': 'twisted.internet.asyncioreactor.AsyncioSelectorReactor'} 2025-09-25 20:47:15 [scrapy.middleware] INFO: Enabled downloader middlewares: ['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware', 'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware', 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware', 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware', 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware', 'news_scraper.middlewares.NewsScraperDownloaderMiddleware', 'scrapy.downloadermiddlewares.retry.RetryMiddleware', 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware', 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware', 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware', 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware', 
'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware', 'scrapy.downloadermiddlewares.stats.DownloaderStats'] 2025-09-25 20:47:15 [scrapy.middleware] INFO: Enabled spider middlewares: ['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware', 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware', 'scrapy.spidermiddlewares.referer.RefererMiddleware', 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware', 'scrapy.spidermiddlewares.depth.DepthMiddleware'] 2025-09-25 20:47:15 [scrapy.middleware] INFO: Enabled item pipelines: [] 2025-09-25 20:47:15 [scrapy.core.engine] INFO: Spider opened 2025-09-25 20:47:15 [botocore.hooks] DEBUG: Changing event name from creating-client-class.iot-data to creating-client-class.iot-data-plane 2025-09-25 20:47:15 [botocore.hooks] DEBUG: Changing event name from before-call.apigateway to before-call.api-gateway 2025-09-25 20:47:15 [botocore.hooks] DEBUG: Changing event name from request-created.machinelearning.Predict to request-created.machine-learning.Predict 2025-09-25 20:47:15 [botocore.hooks] DEBUG: Changing event name from before-parameter-build.autoscaling.CreateLaunchConfiguration to before-parameter-build.auto-scaling.CreateLaunchConfiguration 2025-09-25 20:47:15 [botocore.hooks] DEBUG: Changing event name from before-parameter-build.route53 to before-parameter-build.route-53 2025-09-25 20:47:15 [botocore.hooks] DEBUG: Changing event name from request-created.cloudsearchdomain.Search to request-created.cloudsearch-domain.Search 2025-09-25 20:47:15 [botocore.hooks] DEBUG: Changing event name from docs.*.autoscaling.CreateLaunchConfiguration.complete-section to docs.*.auto-scaling.CreateLaunchConfiguration.complete-section 2025-09-25 20:47:15 [botocore.hooks] DEBUG: Changing event name from before-parameter-build.logs.CreateExportTask to before-parameter-build.cloudwatch-logs.CreateExportTask 2025-09-25 20:47:15 [botocore.hooks] DEBUG: Changing event name from docs.*.logs.CreateExportTask.complete-section to 
docs.*.cloudwatch-logs.CreateExportTask.complete-section 2025-09-25 20:47:15 [botocore.hooks] DEBUG: Changing event name from before-parameter-build.cloudsearchdomain.Search to before-parameter-build.cloudsearch-domain.Search 2025-09-25 20:47:15 [botocore.hooks] DEBUG: Changing event name from docs.*.cloudsearchdomain.Search.complete-section to docs.*.cloudsearch-domain.Search.complete-section 2025-09-25 20:47:15 [botocore.loaders] DEBUG: Loading JSON file: /usr/local/lib/python3.11/site-packages/botocore/data/endpoints.json 2025-09-25 20:47:16 [botocore.loaders] DEBUG: Loading JSON file: /usr/local/lib/python3.11/site-packages/botocore/data/sdk-default-configuration.json 2025-09-25 20:47:16 [botocore.hooks] DEBUG: Event choose-service-name: calling handler 2025-09-25 20:47:16 [botocore.loaders] DEBUG: Loading JSON file: /usr/local/lib/python3.11/site-packages/botocore/data/s3/2006-03-01/service-2.json.gz 2025-09-25 20:47:16 [botocore.loaders] DEBUG: Loading JSON file: /usr/local/lib/python3.11/site-packages/botocore/data/s3/2006-03-01/endpoint-rule-set-1.json.gz 2025-09-25 20:47:16 [botocore.loaders] DEBUG: Loading JSON file: /usr/local/lib/python3.11/site-packages/botocore/data/partitions.json 2025-09-25 20:47:16 [botocore.hooks] DEBUG: Event creating-client-class.s3: calling handler 2025-09-25 20:47:16 [botocore.hooks] DEBUG: Event creating-client-class.s3: calling handler ._handler at 0x7fd9d0a74b80> 2025-09-25 20:47:16 [botocore.hooks] DEBUG: Event creating-client-class.s3: calling handler 2025-09-25 20:47:16 [botocore.endpoint] DEBUG: Setting s3 timeout as (60, 60) 2025-09-25 20:47:16 [botocore.loaders] DEBUG: Loading JSON file: /usr/local/lib/python3.11/site-packages/botocore/data/_retry.json 2025-09-25 20:47:16 [botocore.client] DEBUG: Registering retry handlers for service: s3 2025-09-25 20:47:16 [botocore.utils] DEBUG: Registering S3 region redirector handler 2025-09-25 20:47:16 [botocore.utils] DEBUG: Registering S3Express Identity Resolver 2025-09-25 
20:47:16 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min) 2025-09-25 20:47:16 [saostar_timestamp] INFO: Spider opened: saostar_timestamp 2025-09-25 20:47:16 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6024 2025-09-25 20:47:16 [scrapy.core.engine] DEBUG: Crawled (200) (referer: None) 2025-09-25 20:47:22 [scrapy.core.engine] DEBUG: Crawled (200) (referer: None) 2025-09-25 20:47:25 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/) 2025-09-25 20:47:28 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:47:28 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/giai-tri/jungkook-bts-gay-sot-voi-co-bap-cuon-cuon-202509251712201557.html 2025-09-25 20:47:31 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:47:31 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/vong-quanh-the-gioi/hai-san-phu-kin-bai-bien-sau-con-bao-nguoi-dan-do-xo-di-bat-202509251649535211.html 2025-09-25 20:47:32 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:47:32 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/dien-anh/emma-watson-len-tieng-giai-thich-vi-sao-roi-xa-man-anh-202509251725522331.html 2025-09-25 20:47:35 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:47:35 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/am-nhac/nam-ca-si-tuyen-bo-ket-hon-voi-em-gai-nuoi-sau-hon-30-nam-giau-kin-202509251615050081.html 2025-09-25 20:47:38 [scrapy.core.engine] DEBUG: Crawled (200) 
(referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:47:38 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/sao-sport/world-cup-2030-co-64-doi-tuyen-viet-nam-cung-rat-kho-202509251258087764.html 2025-09-25 20:47:41 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:47:41 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/am-nhac/chen-va-xiumin-exo-khuay-dao-san-khau-chinh-tai-dai-nhac-hoi-quoc-te-202509240921087093.html 2025-09-25 20:47:43 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:47:43 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/sac-mau-cuoc-song/bao-so-9-suy-yeu-bao-so-10-sap-vao-bien-dong-dem-26-9-202509251414555685.html 2025-09-25 20:47:45 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:47:45 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/an-choi-kham-pha/nhung-mon-an-ki-la-tren-the-gioi-202509251416326531.html 2025-09-25 20:47:47 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:47:47 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/sac-mau-cuoc-song/clip-mua-duoc-qua-mit-ve-den-nha-thi-mat-co-gai-lien-check-camera-202509251442015258.html 2025-09-25 20:47:50 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:47:50 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: 
https://www.saostar.vn/an-choi-kham-pha/2-loai-ca-tung-bi-ngo-lo-nay-thanh-dac-san-duoc-san-lung-202509251455360787.html 2025-09-25 20:47:53 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:47:53 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/kinh-doanh/gia-xang-dau-hom-nay-25-9-tang-cuc-manh-202509251306139804.html 2025-09-25 20:47:55 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:47:55 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/nguoi-mau-hoa-hau/thach-thuc-lon-voi-tan-hoa-hau-hoa-binh-viet-nam-202509251227229802.html 2025-09-25 20:47:58 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:47:58 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/giai-tri/vo-sieu-mau-binh-minh-tiet-lo-su-that-kho-ngo-khi-chong-bi-noi-gia-nua-202509251148052185.html 2025-09-25 20:48:01 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:48:01 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/am-nhac/tung-duong-he-lo-hinh-anh-thoi-qua-khu-hoi-do-tre-em-nhin-khoc-thet-202509241416278921.html 2025-09-25 20:48:02 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:48:02 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/sac-mau-cuoc-song/nha-bao-lai-van-sam-canh-bao-202509251511458084.html 2025-09-25 20:48:05 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:48:05 [saostar_timestamp] INFO: 
2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/dien-anh/nhan-sac-thanh-lich-cua-nu-dien-vien-tu-chien-tren-khong-202509251532307177.html 2025-09-25 20:48:07 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:48:07 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/sac-mau-cuoc-song/hut-chat-thai-tu-be-phot-toa-nha-roi-do-xuong-song-to-lich-202509251539073677.html 2025-09-25 20:48:09 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:48:09 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/sao-hoc-duong/nhieu-truong-dai-hoc-cong-bo-lich-nghi-tet-nguyen-dan-2026-202509251600375949.html 2025-09-25 20:48:12 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:48:12 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/sao-sport/bao-thai-lan-noi-that-ve-vdv-bong-chuyen-dang-thi-hong-202509251546152171.html 2025-09-25 20:48:14 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:48:14 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/sac-mau-cuoc-song/hinh-anh-ca-si-khanh-phuong-tai-truoc-tru-so-cong-an-tp-hcm-202509251614270367.html 2025-09-25 20:48:16 [scrapy.extensions.logstats] INFO: Crawled 23 pages (at 23 pages/min), scraped 0 items (at 0 items/min) 2025-09-25 20:48:17 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:48:17 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: 
https://www.saostar.vn/sac-mau-cuoc-song/be-gai-so-sinh-bi-bo-roi-xot-xa-tinh-canh-luc-phat-hien-202509251527531333.html 2025-09-25 20:48:20 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:48:20 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/song-khoe/5-loai-ca-bo-duong-nen-an-vao-mua-thu-202509251616215959.html 2025-09-25 20:48:22 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:48:23 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/giai-tri/nu-dien-vien-noi-tieng-bat-khoc-tai-toa-linh-an-treo-vi-tham-o-70-ty-202509251617569785.html 2025-09-25 20:48:25 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:48:25 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/sao-sport/chuyen-tinh-ngot-ngao-cua-tien-ve-duc-huy-202509251618392994.html 2025-09-25 20:48:27 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:48:27 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/nguoi-mau-hoa-hau/a-hau-vbiz-am-tham-lam-le-dam-ngo-voi-ban-trai-chu-tich-202509251646308301.html 2025-09-25 20:48:30 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:48:30 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/sao-sport/tin-kem-vui-voi-nguyen-xuan-son-202509251649147027.html 2025-09-25 20:48:32 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:48:32 [saostar_timestamp] INFO: 2025-09-25 is out of date range: 
from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/vong-quanh-the-gioi/nguoi-phu-nu-co-hanh-dong-gay-phan-no-voi-con-trai-o-cau-thang-202509250958158769.html 2025-09-25 20:48:33 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:48:34 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/am-nhac/cu-ong-ban-nuoc-hon-10-tieng-de-kiem-tien-xem-concert-chau-kiet-luan-202509211500457302.html 2025-09-25 20:48:36 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:48:36 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/sao-sport/chiec-ao-qua-rong-cua-nguyen-xuan-son-202509251728586733.html 2025-09-25 20:48:38 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:48:38 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/giai-tri/chieu-tro-cua-ngu-thu-han-gay-tranh-cai-202509251631212024.html 2025-09-25 20:48:41 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:48:41 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/sac-mau-cuoc-song/xac-minh-viec-be-gai-bi-bo-bao-hanh-ep-con-bao-me-gui-tien-ve-202509251602522573.html 2025-09-25 20:48:43 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:48:43 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/dien-anh/kim-woo-bin-tai-xuat-ram-ro-hua-hen-bung-no-trong-thang-10-202509251629599395.html 2025-09-25 20:48:46 [scrapy.core.engine] DEBUG: Crawled (200) (referer: 
https://www.saostar.vn/tin-moi/) 2025-09-25 20:48:46 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/sac-mau-cuoc-song/bang-hoang-khi-nhan-tin-nhan-cua-con-trai-bo-me-tuc-toc-bao-cong-an-202509251600597229.html 2025-09-25 20:48:48 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:48:48 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/giai-tri/nam-nghe-si-hai-bi-bat-vi-lai-xe-say-xin-luc-rang-sang-202509251154262676.html 2025-09-25 20:48:51 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:48:51 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/am-nhac/cho-doi-vao-huong-giang-202509251614238425.html 2025-09-25 20:48:54 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:48:54 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/am-nhac/bidv-music-festival-nong-ruc-ngay-ngay-dau-mo-cua-gioi-tre-do-ve-202509251653397271.html 2025-09-25 20:48:56 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:48:56 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/sac-mau-cuoc-song/bao-tin-mat-xe-va-tien-bac-cong-an-vao-cuoc-phat-hien-su-that-bat-ngo-202509252123191253.html 2025-09-25 20:48:59 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:48:59 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: 
https://www.saostar.vn/sao-sport/nam-dinh-fc-gay-that-vong-du-dung-7-ngoai-binh-truoc-clb-campuchia-202509252150366013.html 2025-09-25 20:49:02 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:49:02 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/sac-mau-cuoc-song/bao-so-10-bualoi-chuan-bi-vao-bien-dong-gio-giat-cap-15-202509252126261062.html 2025-09-25 20:49:05 [scrapy.core.engine] DEBUG: Crawled (200) (referer: https://www.saostar.vn/tin-moi/) 2025-09-25 20:49:05 [saostar_timestamp] INFO: 2025-09-25 is out of date range: from 2025-09-26 to 2025-09-26, skipping article: https://www.saostar.vn/sao-hoc-duong/bai-toan-11-4-7-bi-gach-sai-co-giao-dua-ra-dap-an-bang-11-202509252238276713.html 2025-09-25 20:49:05 [scrapy.core.engine] INFO: Closing spider (finished) 2025-09-25 20:49:05 [boto3.s3.transfer] DEBUG: Opting out of CRT Transfer Manager. Preferred client: auto, CRT available: False, Instance Optimized: False. 2025-09-25 20:49:05 [boto3.s3.transfer] DEBUG: Using default client. 
pid: 171368, thread: 140573482597176 2025-09-25 20:49:05 [s3transfer.utils] DEBUG: Acquiring 0 2025-09-25 20:49:05 [s3transfer.tasks] DEBUG: UploadSubmissionTask(transfer_id=0, {'transfer_future': }) about to wait for the following futures [] 2025-09-25 20:49:05 [s3transfer.tasks] DEBUG: UploadSubmissionTask(transfer_id=0, {'transfer_future': }) done waiting for dependent futures 2025-09-25 20:49:05 [s3transfer.tasks] DEBUG: Executing task UploadSubmissionTask(transfer_id=0, {'transfer_future': }) with kwargs {'client': , 'config': , 'osutil': , 'request_executor': , 'transfer_future': } 2025-09-25 20:49:05 [s3transfer.futures] DEBUG: Submitting task PutObjectTask(transfer_id=0, {'bucket': 'dagster-output-data', 'key': 'saostar_timestamp/saostar_timestamp_c5de51009a5011f086971e907748958e_scheduled_2025-09-26.jl', 'extra_args': {}}) to executor for transfer request: 0. 2025-09-25 20:49:05 [s3transfer.utils] DEBUG: Acquiring 0 2025-09-25 20:49:05 [s3transfer.tasks] DEBUG: PutObjectTask(transfer_id=0, {'bucket': 'dagster-output-data', 'key': 'saostar_timestamp/saostar_timestamp_c5de51009a5011f086971e907748958e_scheduled_2025-09-26.jl', 'extra_args': {}}) about to wait for the following futures [] 2025-09-25 20:49:05 [s3transfer.utils] DEBUG: Releasing acquire 0/None 2025-09-25 20:49:05 [s3transfer.tasks] DEBUG: PutObjectTask(transfer_id=0, {'bucket': 'dagster-output-data', 'key': 'saostar_timestamp/saostar_timestamp_c5de51009a5011f086971e907748958e_scheduled_2025-09-26.jl', 'extra_args': {}}) done waiting for dependent futures 2025-09-25 20:49:05 [s3transfer.tasks] DEBUG: Executing task PutObjectTask(transfer_id=0, {'bucket': 'dagster-output-data', 'key': 'saostar_timestamp/saostar_timestamp_c5de51009a5011f086971e907748958e_scheduled_2025-09-26.jl', 'extra_args': {}}) with kwargs {'client': , 'fileobj': , 'bucket': 'dagster-output-data', 'key': 'saostar_timestamp/saostar_timestamp_c5de51009a5011f086971e907748958e_scheduled_2025-09-26.jl', 'extra_args': {}} 2025-09-25 
20:49:05 [botocore.hooks] DEBUG: Event before-parameter-build.s3.PutObject: calling handler 2025-09-25 20:49:05 [botocore.hooks] DEBUG: Event before-parameter-build.s3.PutObject: calling handler 2025-09-25 20:49:05 [botocore.hooks] DEBUG: Event before-parameter-build.s3.PutObject: calling handler 2025-09-25 20:49:05 [botocore.hooks] DEBUG: Event before-parameter-build.s3.PutObject: calling handler 2025-09-25 20:49:05 [botocore.hooks] DEBUG: Event before-parameter-build.s3.PutObject: calling handler 2025-09-25 20:49:05 [botocore.hooks] DEBUG: Event before-parameter-build.s3.PutObject: calling handler > 2025-09-25 20:49:05 [botocore.hooks] DEBUG: Event before-parameter-build.s3.PutObject: calling handler > 2025-09-25 20:49:05 [botocore.hooks] DEBUG: Event before-parameter-build.s3.PutObject: calling handler 2025-09-25 20:49:05 [botocore.hooks] DEBUG: Event before-endpoint-resolution.s3: calling handler 2025-09-25 20:49:05 [botocore.hooks] DEBUG: Event before-endpoint-resolution.s3: calling handler > 2025-09-25 20:49:05 [botocore.regions] DEBUG: Calling endpoint provider with parameters: {'Bucket': 'dagster-output-data', 'Region': 'us-east-1', 'UseFIPS': False, 'UseDualStack': False, 'Endpoint': 'https://lake-api.actable.ai/', 'ForcePathStyle': True, 'Accelerate': False, 'UseGlobalEndpoint': True, 'Key': 'saostar_timestamp/saostar_timestamp_c5de51009a5011f086971e907748958e_scheduled_2025-09-26.jl', 'DisableMultiRegionAccessPoints': False, 'UseArnRegion': True} 2025-09-25 20:49:05 [botocore.regions] DEBUG: Endpoint provider result: https://lake-api.actable.ai/dagster-output-data 2025-09-25 20:49:05 [botocore.regions] DEBUG: Selecting from endpoint provider's list of auth schemes: "sigv4". 
User selected auth scheme is: "None" 2025-09-25 20:49:05 [botocore.regions] DEBUG: Selected auth type "v4" as "v4" with signing context params: {'region': 'us-east-1', 'signing_name': 's3', 'disableDoubleEncoding': True} 2025-09-25 20:49:05 [botocore.hooks] DEBUG: Event before-call.s3.PutObject: calling handler 2025-09-25 20:49:05 [botocore.hooks] DEBUG: Event before-call.s3.PutObject: calling handler 2025-09-25 20:49:05 [botocore.handlers] DEBUG: Adding expect 100 continue header to request. 2025-09-25 20:49:05 [botocore.hooks] DEBUG: Event before-call.s3.PutObject: calling handler > 2025-09-25 20:49:05 [botocore.hooks] DEBUG: Event before-call.s3.PutObject: calling handler 2025-09-25 20:49:05 [botocore.hooks] DEBUG: Event before-call.s3.PutObject: calling handler 2025-09-25 20:49:05 [botocore.endpoint] DEBUG: Making request for OperationModel(name=PutObject) with params: {'url_path': '/saostar_timestamp/saostar_timestamp_c5de51009a5011f086971e907748958e_scheduled_2025-09-26.jl', 'query_string': {}, 'method': 'PUT', 'headers': {'User-Agent': 'Boto3/1.34.57 md/Botocore#1.34.162 ua/2.0 os/linux#5.15.0-139-generic md/arch#x86_64 lang/python#3.11.13 md/pyimpl#CPython cfg/retry-mode#legacy Botocore/1.34.162', 'Content-MD5': '1B2M2Y8AsgTpgAmY7PhCfg==', 'Expect': '100-continue'}, 'body': , 'auth_path': '/dagster-output-data/saostar_timestamp/saostar_timestamp_c5de51009a5011f086971e907748958e_scheduled_2025-09-26.jl', 'url': 'https://lake-api.actable.ai/dagster-output-data/saostar_timestamp/saostar_timestamp_c5de51009a5011f086971e907748958e_scheduled_2025-09-26.jl', 'context': {'client_region': 'us-east-1', 'client_config': , 'has_streaming_input': True, 'auth_type': 'v4', 's3_redirect': {'redirected': False, 'bucket': 'dagster-output-data', 'params': {'Bucket': 'dagster-output-data', 'Key': 'saostar_timestamp/saostar_timestamp_c5de51009a5011f086971e907748958e_scheduled_2025-09-26.jl', 'Body': }}, 'input_params': {'Bucket': 'dagster-output-data', 'Key': 
'saostar_timestamp/saostar_timestamp_c5de51009a5011f086971e907748958e_scheduled_2025-09-26.jl'}, 'signing': {'region': 'us-east-1', 'signing_name': 's3', 'disableDoubleEncoding': True}, 'endpoint_properties': {'authSchemes': [{'disableDoubleEncoding': True, 'name': 'sigv4', 'signingName': 's3', 'signingRegion': 'us-east-1'}]}}} 2025-09-25 20:49:05 [botocore.hooks] DEBUG: Event request-created.s3.PutObject: calling handler 2025-09-25 20:49:05 [botocore.hooks] DEBUG: Event request-created.s3.PutObject: calling handler > 2025-09-25 20:49:05 [botocore.hooks] DEBUG: Event choose-signer.s3.PutObject: calling handler > 2025-09-25 20:49:05 [botocore.hooks] DEBUG: Event choose-signer.s3.PutObject: calling handler 2025-09-25 20:49:05 [botocore.hooks] DEBUG: Event before-sign.s3.PutObject: calling handler 2025-09-25 20:49:05 [botocore.hooks] DEBUG: Event before-sign.s3.PutObject: calling handler > 2025-09-25 20:49:05 [botocore.auth] DEBUG: Calculating signature using v4 auth. 2025-09-25 20:49:05 [botocore.auth] DEBUG: CanonicalRequest: PUT /dagster-output-data/saostar_timestamp/saostar_timestamp_c5de51009a5011f086971e907748958e_scheduled_2025-09-26.jl content-md5:1B2M2Y8AsgTpgAmY7PhCfg== host:lake-api.actable.ai x-amz-content-sha256:UNSIGNED-PAYLOAD x-amz-date:20250925T204905Z content-md5;host;x-amz-content-sha256;x-amz-date UNSIGNED-PAYLOAD 2025-09-25 20:49:05 [botocore.auth] DEBUG: StringToSign: AWS4-HMAC-SHA256 20250925T204905Z 20250925/us-east-1/s3/aws4_request 110b7ba4ace5d3426b540ee9980faeab4444fe7b3e01e1f2d99d4074452b9330 2025-09-25 20:49:05 [botocore.auth] DEBUG: Signature: 540fcae75fdc55ab46ecca31101d7d9de483519e3aef11383c55677bffae152b 2025-09-25 20:49:05 [botocore.hooks] DEBUG: Event request-created.s3.PutObject: calling handler 2025-09-25 20:49:05 [botocore.hooks] DEBUG: Event request-created.s3.PutObject: calling handler 2025-09-25 20:49:05 [botocore.endpoint] DEBUG: Sending http request: 2025-09-25 20:49:05 [botocore.httpsession] DEBUG: Certificate path: 
/usr/local/lib/python3.11/site-packages/certifi/cacert.pem 2025-09-25 20:49:05 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): lake-api.actable.ai:443 2025-09-25 20:49:05 [botocore.awsrequest] DEBUG: Waiting for 100 Continue response. 2025-09-25 20:49:05 [botocore.awsrequest] DEBUG: 100 Continue response seen, now sending request body. 2025-09-25 20:49:05 [urllib3.connectionpool] DEBUG: https://lake-api.actable.ai:443 "PUT /dagster-output-data/saostar_timestamp/saostar_timestamp_c5de51009a5011f086971e907748958e_scheduled_2025-09-26.jl HTTP/1.1" 200 0 2025-09-25 20:49:05 [botocore.parsers] DEBUG: Response headers: {'Server': 'nginx/1.24.0 (Ubuntu)', 'Date': 'Thu, 25 Sep 2025 20:49:05 GMT', 'Content-Length': '0', 'Connection': 'keep-alive', 'Accept-Ranges': 'bytes', 'ETag': '"d41d8cd98f00b204e9800998ecf8427e"', 'Strict-Transport-Security': 'max-age=31536000; includeSubDomains', 'Vary': 'Origin, Accept-Encoding', 'X-Amz-Bucket-Region': 'us-east-1', 'X-Amz-Id-2': 'dd9025bab4ad464b049177c95eb6ebf374d3b3fd1af9251148b658df7ac2e3e8', 'X-Amz-Request-Id': '1868A19BDC797C69', 'X-Content-Type-Options': 'nosniff', 'X-Ratelimit-Limit': '25637', 'X-Ratelimit-Remaining': '25637', 'X-Xss-Protection': '1; mode=block'} 2025-09-25 20:49:05 [botocore.parsers] DEBUG: Response body: b'' 2025-09-25 20:49:05 [botocore.hooks] DEBUG: Event needs-retry.s3.PutObject: calling handler 2025-09-25 20:49:05 [botocore.retryhandler] DEBUG: No retry needed. 
2025-09-25 20:49:05 [botocore.hooks] DEBUG: Event needs-retry.s3.PutObject: calling handler >
2025-09-25 20:49:05 [s3transfer.utils] DEBUG: Releasing acquire 0/None
2025-09-25 20:49:05 [scrapy.extensions.feedexport] INFO: Stored jsonlines feed (0 items) in: s3://dagster-output-data/saostar_timestamp/saostar_timestamp_c5de51009a5011f086971e907748958e_scheduled_2025-09-26.jl
2025-09-25 20:49:05 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 14606,
 'downloader/request_count': 43,
 'downloader/request_method_count/GET': 43,
 'downloader/response_bytes': 455874,
 'downloader/response_count': 43,
 'downloader/response_status_count/200': 43,
 'elapsed_time_seconds': 109.395078,
 'feedexport/success_count/S3FeedStorage': 1,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2025, 9, 25, 20, 49, 5, 260231, tzinfo=datetime.timezone.utc),
 'httpcompression/response_bytes': 1556782,
 'httpcompression/response_count': 43,
 'log_count/DEBUG': 153,
 'log_count/INFO': 53,
 'memusage/max': 128204800,
 'memusage/startup': 124522496,
 'request_depth_max': 2,
 'response_received_count': 43,
 'robotstxt/request_count': 1,
 'robotstxt/response_count': 1,
 'robotstxt/response_status_count/200': 1,
 'scheduler/dequeued': 42,
 'scheduler/dequeued/memory': 42,
 'scheduler/enqueued': 42,
 'scheduler/enqueued/memory': 42,
 'start_time': datetime.datetime(2025, 9, 25, 20, 47, 15, 865153, tzinfo=datetime.timezone.utc)}
2025-09-25 20:49:05 [scrapy.core.engine] INFO: Spider closed (finished)