Onedio Scraper

3 days trial then $20.00/month - No credit card required now

epctex/onedio-scraper

Actor - Onedio Scraper

Onedio is a popular digital media platform based in Turkey that offers a wide range of content, including news, entertainment, lifestyle, technology, and more. It combines editorial articles, user-generated content, quizzes, lists, and videos to engage a diverse audience. Onedio is known for its vibrant and interactive content, often tailored to a young, internet-savvy demographic. The platform aims to provide entertaining and informative content in a visually appealing format, leveraging social media to reach a broader audience.

  • Flexible Keyword Search: Seamlessly search for any keyword and retrieve the matching articles, so you can pinpoint exactly the information you need.

  • User-Centric Scraping: Dive deep into users' contributions by scraping articles written by specific individuals, gaining insight into their perspectives.

  • Category and User Articles Scraping: Fetch content from specific profiles, or gather articles from specific categories right away.

  • Retrieve Real Comment Information: Retrieve not just the articles but also the comments generated by the community: usernames, view counts, messages, and more, all at your fingertips.

Bugs, fixes, updates, and changelog

This scraper is under active development. If you have any feature requests, you can create an issue.

Input Parameters

The input of this scraper should be JSON containing the list of pages on Onedio that should be visited. The supported fields are listed below; only proxy is required:

  • startUrls: (Optional) (Array) List of Onedio URLs. Provide only category, search, user profile, or article URLs.

  • search: (Optional) (String) Keyword that you want to search on Onedio.

  • endPage: (Optional) (Number) The last page number that you want to scrape. The default is infinite (no limit). This applies to each search request and to each of the startUrls individually.

  • endPageForComments: (Optional) (Number) The last page of comments that you want to scrape. The default is infinite (no limit). This applies to each article request individually.

  • maxItems: (Optional) (Number) Limits the number of scraped items. This is useful when you scrape large lists or search results.

  • proxy: (Required) (Proxy Object) Proxy configuration.

  • customMapFunction: (Optional) (String) Function that takes each scraped object as its argument and returns the object produced by executing the function.

This solution requires proxy servers: either your own proxy servers or Apify Proxy.

Tip

When you want to scrape a specific list URL, just copy and paste the link as one of the startUrls.

If you would like to scrape only the first page of a list, provide the link for that page and set endPage to 1.

With the same approach you can also fetch any interval of pages. If you provide the 5th page of a list and set the endPage parameter to 6, then you'll get the 5th and 6th pages only.
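
The page-interval rule described in this tip can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the actor's own code; the function name pages_to_scrape is hypothetical, and it assumes the page number of the start URL is already known.

```python
from typing import Optional


def pages_to_scrape(start_page: int, end_page: Optional[int]) -> Optional[range]:
    """Return the inclusive range of pages that would be visited.

    start_page is the page number of the URL placed in startUrls and
    end_page mirrors the endPage input. None means endPage was left at
    its default (infinite), so there is no upper bound to report.
    """
    if start_page < 1:
        raise ValueError("start_page must be 1 or greater")
    if end_page is None:
        return None
    # An end_page below start_page simply yields an empty range.
    return range(start_page, end_page + 1)


print(list(pages_to_scrape(1, 1)))  # [1]
print(list(pages_to_scrape(5, 6)))  # [5, 6]
```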

Compute Unit Consumption

The actor is optimized to run as fast as possible and scrape as many items as possible. Therefore, it prioritizes all the detail requests. If the actor isn't blocked very often, it will scrape 100 listings in about 1 minute with ~0.03-0.04 compute units.
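
As a rough budgeting aid, the figures above can be extrapolated to larger runs. The sketch below assumes that consumption scales linearly with the number of listings and that the actor is rarely blocked; neither is guaranteed, so treat the result as a ballpark only. The function name estimate_run is hypothetical.

```python
from typing import Tuple

# Figures quoted above: 100 listings take about 1 minute and 0.03-0.04 compute units.
LISTINGS_PER_BATCH = 100
CU_LOW_PER_BATCH = 0.03
CU_HIGH_PER_BATCH = 0.04
MINUTES_PER_BATCH = 1.0


def estimate_run(listings: int) -> Tuple[float, float, float]:
    """Return (low CU, high CU, minutes), assuming purely linear scaling."""
    if listings < 0:
        raise ValueError("listings must not be negative")
    batches = listings / LISTINGS_PER_BATCH
    return (
        CU_LOW_PER_BATCH * batches,
        CU_HIGH_PER_BATCH * batches,
        MINUTES_PER_BATCH * batches,
    )


low, high, minutes = estimate_run(1000)
print(f"~{low:.2f}-{high:.2f} compute units in roughly {minutes:.0f} minutes")
```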

Onedio Scraper Input example

{
    "search": "mavi ceket",
    "startUrls": [
        "https://onedio.com//haber/su-aslinda-seffafken-okyanus-ve-denizler-neden-mavi-renktedir-1165941",
        "https://onedio.com/ara/haber/ceket",
        "https://onedio.com/profil/cometasincuerda",
        "https://onedio.com/yemek/hamur-isi/pizza/2"
    ],
    "endPage": 3,
    "maxItems": 100,
    "includeComments": false,
    "endPageForComments": 30,
    "proxy": {
        "useApifyProxy": true
    }
}
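
If you assemble the input programmatically, a small helper can keep the field names consistent with the list above. The following is a minimal sketch: the function build_run_input and its checks are illustrative assumptions, and the actor performs its own validation regardless.

```python
import json
from typing import Any, Dict, List, Optional


def build_run_input(
    search: Optional[str] = None,
    start_urls: Optional[List[str]] = None,
    end_page: Optional[int] = None,
    end_page_for_comments: Optional[int] = None,
    max_items: Optional[int] = None,
    use_apify_proxy: bool = True,
) -> Dict[str, Any]:
    """Build the input JSON object; optional fields are left out when unset."""
    # proxy is the only field documented as required.
    run_input: Dict[str, Any] = {"proxy": {"useApifyProxy": use_apify_proxy}}
    if search is not None:
        run_input["search"] = search
    if start_urls:
        run_input["startUrls"] = list(start_urls)
    numeric_fields = (
        ("endPage", end_page),
        ("endPageForComments", end_page_for_comments),
        ("maxItems", max_items),
    )
    for key, value in numeric_fields:
        if value is None:
            continue
        # Assumption: page numbers and item limits are positive integers.
        if not isinstance(value, int) or value < 1:
            raise ValueError(f"{key} must be a positive integer")
        run_input[key] = value
    return run_input


run_input = build_run_input(
    search="mavi ceket",
    start_urls=["https://onedio.com/ara/haber/ceket"],
    end_page=3,
    max_items=100,
)
print(json.dumps(run_input, indent=4, ensure_ascii=False))
```

Leaving a numeric field unset keeps the documented default (infinite) instead of sending a placeholder value.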

During the Run

During the run, the actor will output messages letting you know what is going on. Each message contains a short label specifying which page from the provided list is currently being processed.

When items are loaded from the page, you should see a message about this event with a loaded item count and total item count for each page.

If you provide incorrect input to the actor, it will immediately stop with a failure state and output an explanation of what is wrong.

Onedio Export

During the run, the actor stores results into a dataset. Each article is a separate item in the dataset.

You can manage the results in any language (Python, PHP, Node JS/NPM). See the FAQ or our API reference to learn more about getting results from this Onedio actor.
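
For Python, one possible way to start a run and read the dataset is the apify-client package. The sketch below assumes that package is installed (pip install apify-client) and that you have an Apify API token; the helper names run_and_fetch and only_articles are hypothetical. The import is placed inside the function so the filtering helper can be used without the package.

```python
from typing import Any, Dict, Iterable, List

ACTOR_ID = "epctex/onedio-scraper"


def run_and_fetch(token: str, run_input: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Start the actor, wait for it to finish, and return all dataset items."""
    # Third-party dependency, imported lazily: pip install apify-client
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("The actor run did not return any run details")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def only_articles(items: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only the items whose "type" field is "article"."""
    return [item for item in items if item.get("type") == "article"]


# Example call (requires a real token and network access):
# items = run_and_fetch("<YOUR_APIFY_TOKEN>", {"search": "mavi ceket",
#                                              "proxy": {"useApifyProxy": True}})
# articles = only_articles(items)
```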

Scraped Onedio Properties

The structure of each article in Onedio looks like this:

Article Detail

{
  "type": "article",
  "url": "https://onedio.com//haber/her-kosesine-gizlenmis-detaylarda-farkli-bir-atasozunun-bulundugu-muhtesem-tablo-mavi-pelerin-1007094",
  "name": "Her Köşesine Gizlenmiş Detaylarda Farklı Bir Atasözünün Bulunduğu Muhteşem Tablo: Mavi Pelerin",
  "id": "6159bca8be42440d22c5ba48",
  "tags": [
    "~experimental-videogifs"
  ],
  "stats": {
    "views": 74047,
    "shares": 459,
    "comments": 4
  },
  "originalTitle": "Her Köşesine Gizlenmiş Detaylarda Farklı Bir Atasözünün Bulunduğu Muhteşem Tablo: Mavi Pelerin",
  "legacyId": 1007094,
  "keywords": [
    "koses"
  ],
  "image": {
    "format": "jpg",
    "height": 628,
    "width": 1200,
    "thumbnail": null,
    "url": "https://img-s1.onedio.com/id-6159be6e8a5bbc071672bb03/rev-0/w-1200/h-628/f-jpg/s-f070195baabcade6efc67f4876d779bae3e1d616.jpg"
  },
  "flags": {
    "isDeleted": false,
    "isDraft": false,
    "isHidden": false,
    "isTest": false,
    "isVideo": false,
    "noindex": false,
    "showAds": true,
    "showComments": true,
    "showReactions": true,
    "showSharesCount": true,
    "showVideoPageDesign": false,
    "showViewCount": true
  },
  "description": "Yüzyıllardır tartışmalara konu olan, gizemleri hala çözülemeyen binlerce sanat eseri artık internet sayesinde tek tıkla karşımıza geliyor. Biz de bizzat gezemediğimiz müzelerden değerli bir tabloyu incelemeye aldık: Bruegel'in 1559 tarihli 'Mavi Pelerin' eseri. 'Felemenk Atasözleri' olarak da bilinen tablonun en ince detaylarını sizler için inceledik. Buyrun içeriğe! 👇",
  "createdAt": "03.10.2021 17:22",
  "modifiedAt": "10.10.2021 17:03",
  "publishedAt": "10.10.2021 17:03",
  "updatedAt": "10.10.2021 17:03",
  "categories": [
    {
      "_id": "62390d5f4c037f8216162711",
      "name": "Genel Kültür",
      "description": "Görsel sanatlar, tiyatro, sinema, sergi ve daha nice genel kültür haberleri Onedio'da. Türkiye ve dünya sanat haberleri ile ilgili gelişmeler için hemen tıkla!",
      "color": "#053F5C",
      "icons": {
        "iconCode": "cafe",
        "png": "https://static.onedio.com/icons/category/png/genel-kultur-v2.png"
      },
      "route": "/genel-kultur"
    }
  ],
  "badges": [
    {
      "_id": "562e40f6df878a1246186f03",
      "name": "Helal olsun!",
      "description": "Helal olsun! ile ilgili tüm haberler, içerikler, galeriler, testler ve videolar Onedio’da. Helal olsun! ile ilgili son dakika haberleri ve gelişmelerini, yeni içerikleri de bu sayfa üzerinden takip edebilirsiniz.",
      "metaDescription": "Helal olsun! hakkında son dakika haberleri Onedio’da. En çok konuşulan Helal olsun! haberlerini keşfetmek için hemen tıkla!",
      "isBadge": true,
      "icons": {
        "svg": "https://static.onedio.com/icons/badge/svg/clap.svg",
        "png": "https://static.onedio.com/icons/badge/png/clap.png"
      },
      "route": "/helal-olsun-haberleri"
    }
  ],
  "author": [
    {
      "avatar": {
        "format": "jpg",
        "height": 1188,
        "width": 1200,
        "thumbnail": null,
        "url": "https://img-s3.onedio.com/id-641f62a42b912a7a30ab8a68/rev-0/w-1200/h-1188/f-jpg/s-a9790c6120d2affb81d6bcad07f977dc1209aa07.jpg"
      },
      "city": "Ankara",
      "country": "TR",
      "route": "/profil/elalalolof",
      "gender": "female",
      "name": "Elif",
      "title": "Onedio Üyesi",
      "username": "elalalolof",
      "id": "6055d6e149830c0326ead5c1",
      "flags": [
        "mail.surpress.newsletter"
      ]
    }
  ],
  "entries": [
    {
      "_id": "6159bca8be42440d22c5ba23",
      "mode": "text",
      "title": null,
      "text": {
        "plain": "Yüzyıllardır tartışmalara konu olan, gizemleri hala çözülemeyen binlerce sanat eseri artık internet sayesinde tek tıkla karşımıza geliyor. Biz de bizzat gezemediğimiz müzelerden değerli bir tabloyu incelemeye aldık: Bruegel'in 1559 tarihli 'Mavi Pelerin' eseri. 'Felemenk Atasözleri' olarak da bilinen tablonun en ince detaylarını sizler için inceledik. Buyrun içeriğe! 👇",
        "html": "<p>Yüzyıllardır tartışmalara konu olan, gizemleri hala çözülemeyen binlerce sanat eseri artık internet sayesinde tek tıkla karşımıza geliyor. Biz de bizzat gezemediğimiz müzelerden değerli bir tabloyu incelemeye aldık: Bruegel'in 1559 tarihli 'Mavi Pelerin' eseri. 'Felemenk Atasözleri' olarak da bilinen tablonun en ince detaylarını sizler için inceledik. Buyrun içeriğe! 👇</p>"
      },
      "image": null,
      "metadata": {
        "fittoscreen": true,
        "unordered": true,
        "hideIn": {
          "desktop": false,
          "mobile": false
        }
      },
      "internaldata": {},
      "urls": {
        "image": null,
        "source": "https://artsandculture.google.com/asset/the-dutch-proverbs-pieter-bruegel-the-elder/WwG8mD89xbELbQ"
      }
    }
  ]
}
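
Once items with this structure are downloaded, they can be reduced to the handful of fields most analyses need. The sketch below is one possible approach, not part of the actor. It assumes that timestamps use the day.month.year hour:minute layout seen in the example, and the function name summarize_article is hypothetical.

```python
from datetime import datetime
from typing import Any, Dict

# Assumption: timestamps such as "03.10.2021 17:22" are day.month.year hour:minute.
DATE_FORMAT = "%d.%m.%Y %H:%M"


def summarize_article(item: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce one dataset item to a few commonly used fields."""
    # An entry's "text" may be null, so fall back to an empty mapping.
    plain_parts = [
        (entry.get("text") or {}).get("plain")
        for entry in item.get("entries", [])
    ]
    return {
        "id": item["id"],
        "name": item["name"],
        "views": item.get("stats", {}).get("views"),
        "published": datetime.strptime(item["publishedAt"], DATE_FORMAT),
        "authors": [author["username"] for author in item.get("author", [])],
        "categories": [cat["name"] for cat in item.get("categories", [])],
        "text": "\n".join(part for part in plain_parts if part),
    }


# Trimmed copy of the example item (most fields omitted, entry text shortened).
sample = {
    "type": "article",
    "id": "6159bca8be42440d22c5ba48",
    "name": "Mavi Pelerin",
    "stats": {"views": 74047, "shares": 459, "comments": 4},
    "publishedAt": "10.10.2021 17:03",
    "categories": [{"name": "Genel Kültür", "route": "/genel-kultur"}],
    "author": [{"username": "elalalolof", "name": "Elif"}],
    "entries": [{"mode": "text", "text": {"plain": "Buyrun içeriğe! 👇"}, "image": None}],
}

summary = summarize_article(sample)
print(summary["published"].isoformat())  # 2021-10-10T17:03:00
print(summary["authors"])  # ['elalalolof']
```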

Contact

Please visit epctex.com to see all the products that are available to you. If you are looking for a custom integration or something similar, please reach out to us through the chat box on epctex.com. In need of support? devops@epctex.com is at your service.

Developer
Maintained by Community
Actor metrics
  • 1 monthly user
  • 0 stars
  • 100.0% runs succeeded
  • Created in Jul 2024
  • Modified about 23 hours ago