
Xueqiu User Posts Scraper
Pricing
$20.00/month + usage

Comprehensive Xueqiu.com posts scraper for extracting valuable financial social media data from China's leading investment platform. Collect user posts, stock discussions, portfolio insights, and market sentiment data with detailed metadata for investment research and analysis.
Last modified
4 days ago
Contact
If you encounter any issues or need to exchange information, please feel free to contact us through the following link: My profile
Xueqiu.com Posts Scraper - Automated Posts Extraction
Introduction
Xueqiu.com (雪球) stands as one of the most popular financial social networks among Chinese investors, serving as a critical hub for investment discussions, stock analysis, and portfolio sharing. Unlike general social networks, almost all information on Xueqiu is related to stocks, making it a natural data source for financial research and market sentiment analysis.
The need to scrape Xueqiu.com data has become increasingly important for financial analysts, researchers, and investment professionals who require access to authentic retail investor sentiment and professional financial analysis from China's market. Xueqiu provides information about stock markets in the Chinese mainland, Hong Kong and the US, along with data queries about stocks, funds and bonds, making it an invaluable resource for comprehensive market analysis.
Our Xueqiu.com Posts Scraper addresses the challenge of manually collecting this vast amount of financial social media data, enabling automated extraction of user posts, engagement metrics, and detailed metadata that can be used for sentiment analysis, trend identification, and investment decision-making.
Overview of Xueqiu.com Posts Scraper
The Xueqiu.com Posts Scraper is a sophisticated data extraction tool designed specifically to harvest posts and associated metadata from user profiles on Xueqiu.com. This scraper efficiently navigates through user timelines, collecting comprehensive post data including text content, engagement metrics, financial symbols, and user interaction data.
Key Features:
- Comprehensive Data Extraction: Captures over 80 different data fields per post
- User Profile Targeting: Focuses on specific user profiles for targeted data collection
- Engagement Metrics: Collects detailed interaction data including likes, retweets, and comments
- Financial Context: Extracts stock symbols, financial discussions, and investment-related metadata
- Scalable Architecture: Handles multiple user profiles simultaneously with configurable limits
Target Users:
- Financial analysts and researchers studying Chinese market sentiment
- Investment firms analyzing retail investor behavior
- Academic researchers in behavioral finance
- Market intelligence teams tracking social media influence on stock prices
- Quantitative analysts developing sentiment-based trading strategies
Input and Output Details
Input Configuration
Example URL 1: https://xueqiu.com/u/1821992043
Example URL 2: https://xueqiu.com/u/5584866422
Example URL 3: https://xueqiu.com/u/5355205180
The scraper accepts a JSON configuration with the following parameters:
{
  "max_retries_per_url": 2, // Maximum number of retry attempts for each URL you provide.
  "proxy": { // Use a proxy to reduce the chance of being detected as a bot during collection.
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"],
    "apifyProxyCountry": "SG" // Choose a country that suits the market you want to collect data from.
  },
  "max_items_per_url": 20, // Maximum number of posts to scrape per URL.
  "urls": [ // Links to user profile pages.
    "https://xueqiu.com/u/1821992043",
    "https://xueqiu.com/u/5584866422",
    "https://xueqiu.com/u/5355205180"
  ]
}
Input Parameters Explained:
- max_retries_per_url: Sets retry attempts when accessing URLs (recommended: 2-3)
- proxy settings: Essential for avoiding bot detection; residential proxies recommended
- apifyProxyCountry: Two-letter country code for the proxy exit location; it should suit your target market (the example uses SG for Singapore; HK for Hong Kong is another Asian option)
- max_items_per_url: Controls how many posts to scrape per profile page
- urls: Array of Xueqiu.com profile page URLs to scrape
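The parameters above can be assembled and sanity-checked before a run. The following is a minimal sketch, not part of the Actor itself: the helper name build_run_input and its validation rules (numeric profile IDs, duplicate removal) are assumptions made for illustration, based only on the example URLs and configuration shown above.

```python
import json
import re

# Assumption: profile URLs follow the pattern of the documented examples,
# https://xueqiu.com/u/<numeric id>.
PROFILE_URL = re.compile(r"^https://xueqiu\.com/u/\d+$")


def build_run_input(urls, max_items_per_url=20, max_retries_per_url=2,
                    proxy_country="SG"):
    """Return an input dict in the documented shape (hypothetical helper)."""
    cleaned = []
    for url in urls:
        url = url.strip().rstrip("/")
        if not PROFILE_URL.match(url):
            raise ValueError(f"not a Xueqiu profile URL: {url!r}")
        if url not in cleaned:  # drop duplicates, keep first-seen order
            cleaned.append(url)
    if max_items_per_url < 1 or max_retries_per_url < 0:
        raise ValueError(
            "max_items_per_url must be >= 1 and max_retries_per_url >= 0"
        )
    return {
        "max_retries_per_url": max_retries_per_url,
        "proxy": {
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
            "apifyProxyCountry": proxy_country,
        },
        "max_items_per_url": max_items_per_url,
        "urls": cleaned,
    }


run_input = build_run_input([
    "https://xueqiu.com/u/5355205180",
    "https://xueqiu.com/u/5584866422",
    "https://xueqiu.com/u/5355205180",  # repeated entry, removed by the helper
])
print(json.dumps(run_input, indent=2))
```

Removing repeated URLs before the run avoids paying to scrape the same profile twice; the printed JSON can be pasted into the Actor's input.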
Output:
The scraper delivers structured data with the following fields:
[ // List of post information{"id": 345575898,"user_id": 1821992043,"source": "雪球","title": "股息为盾,成长为矛","created_at": 1754361591000,"retweet_count": 30,"fav_count": 190,"truncated": false,"comment_id": null,"retweet_status_id": 0,"symbol_id": null,"description": "最近市场上关于红利股和成长股之间的讨论非常激烈。以强周期,顺周期,超跌股为主的吃息佬坚持高股息选股策略,对于题材股抱着隔岸观火的态度;以电子,医药,互联网为主的题材派,一百个瞧不上大烂臭。<br/>个人对这两个选股思路倒都不抵触。如果透过现象看本质,这两种选股思路其实都是现金流折...","type": "3","source_link": null,"edited_at": null,"user": {"subscribeable": false,"common_count": 0,"remark": "","recommend_reason": null,"domain": "","type": "1","province": "北京","city": "不限","gender": "m","status_count": 65138,"last_status_id": 345654788,"verified_description": "","blog_description": "","stocks_count": 162,"verified_type": 0,"allow_all_stock": false,"step": "three","intro": null,"recommend": null,"st_color": "1","followers_count": 316241,"friends_count": 181,"following": false,"follow_me": false,"stock_status_count": null,"screen_name": "ice_招行谷子地","location": "","description": "","verified": false,"profile": "/1821992043","id": 1821992043,"url": "","status": 1,"blocking": false,"donate_count": 9,"name": "","verified_infos": [{"data": [{"desc": "雪球2021年度十大影响力用户","icon": "https://xqimg.imedao.com/17e945e0336293fe4e96d403.png","link": "https://xueqiu.com/2552920054/209971707"},{"desc": "雪球十年知己","icon": "https://xqimg.imedao.com/176419ecbca1583fdb1c0fd9.png","link": "https://xueqiu.com/2552920054/164961444"},{"desc": "雪球2019年度十大影响力用户","icon": "https://xqimg.imedao.com/172fa6645e63f3fc38765696.png","link": ""},{"desc": "雪球2018年度十大内容贡献奖得主","icon": "https://xqimg.imedao.com/172fa65a7d93d3fd61adbc43.png","link": ""}],"verified_type": "10","verified_desc": "雪球2021年度十大影响力用户"},{"data": [],"verified_type": "3","verified_desc": "银行股研究达人"},{"data": [],"verified_type": "5","verified_desc": "用户已完成实名身份认证"}],"group_ids": null,"verified_realname": true,"name_pinyin": null,"screenname_pinyin": null,"photo_domain": "//xavatar.imedao.com/","live_info": 
{},"profile_image_url": "community/202010/1605018254892-1605018255073.jpeg,community/202010/1605018254892-1605018255073.jpeg!180x180.png,community/202010/1605018254892-1605018255073.jpeg!50x50.png,community/202010/1605018254892-1605018255073.jpeg!30x30.png","user_id": 1821992043},"retweeted_status": null,"answers": null,"rqtype": null,"rqid": 0,"target": "/1821992043/345575898","fragment": null,"blocked": false,"blocking": false,"topic_pic": null,"topic_symbol": null,"topic_title": null,"topic_desc": null,"donate_count": 0,"donate_snowcoin": 0,"liked": false,"view_count": 0,"weixin_retweet_count": 0,"mark": 5,"card": null,"favorited": false,"favorited_created_at": null,"offer": null,"score": null,"controversial": false,"fundx_hold": null,"tips": null,"stock_correlation": null,"time_before": null,"can_edit": null,"expend": true,"reward": true,"first_img": null,"raw_title": null,"tag_str": null,"long_text_for_ios": null,"topic_pic_thumbnail_small": null,"topic_pic_thumbnail": null,"topic_pic_head_or_pad": null,"tags_for_web": null,"reply_count": 356,"meta_keywords": "{\"post_position\":\"pc_home_post\",\"ip_location\":\"北京\",\"stockCorrelation\":\"BK2049_BK2415_BK0056_BK0055\",\"stockList\":\"BK2049_35,BK2415_35,BK0056_81,BK0055_81\"}","paid_mention": null,"reward_count": 0,"reward_amount": 0,"reward_user_count": 0,"talk_count": 0,"like_count": 370,"video_info": null,"quote_cards": null,"promotion_pic": null,"promotion_url": null,"promotion_id": 0,"mark_desc": null,"pic_sizes": [],"cover_pic_size": null,"order_id": 0,"tags": [],"status_industry": null,"excellent_comments": [],"notice_tag": null,"common_emotion": null,"current_stock_price": null,"new_card": null,"source_deep_link": null,"source_feed": false,"editable": true,"like_config": null,"title_ad_pic": null,"title_ad_url": null,"title_ad_deep_link": null,"answer_comment": null,"answer_count": 0,"answer_users": null,"bonus_screen_name": null,"fundx_tag": null,"extend_st_feed_info": {"stock_event_time": 
0,"stock_event_type": null,"change_event_color": null,"is_future_event": false,"stock_event_id": null,"stock_event_message": null,"original_text_length": 1857,"disputed_status": false,"excellent_status": "false"},"stock_list": [{"symbol": "BK2049","type": "35"},{"symbol": "BK2415","type": "35"},{"symbol": "BK0056","type": "81"},{"symbol": "BK0055","type": "81"}],"reply_user_images": null,"reply_user_count": 0,"fundx_symbol": null,"hot_new_rank": null,"image_info_list": null,"extend_st_home_page": {"ip_location": "北京","st_mention_stock_count": 8,"ai_card": null},"forbidden_retweet": false,"pic": "","vod_info": null,"recommend_cards": null,"is_answer": false,"is_refused": false,"text": "<p>最近市场上关于红利股和成长股之间的讨论非常激烈。以强周期,顺周期,超跌股为主的吃息佬坚持高股息选股策略,对于题材股抱着隔岸观火的态度;以电子,医药,互联网为主的题材派,一百个瞧不上大烂臭。</p><p>个人对这两个选股思路倒都不抵触。如果透过现象看本质,这两种选股思路其实都是现金流折现的一种场景。其实如果你在股市的时间比较长,深入研究过各类估值方法,其背后本质都是 DCF。只不过不同的公司其成长曲线不同,在DCF下的表现自然不同。</p><p>对于成熟的强周期行业,其净利润可能表现为线性函数或者以线性函数为中轴的波动曲线。所以,其现金流折现的值是相对稳定或者变化不大的。投资者在投资此类公司的时候必然会在股价显著超过现金流折现价值的时候卖出,而在股价大幅低于现金流折现价值的时候买入。对于,普通散户更直观的理解就是低估买入,高估卖出,即高股息率的时候建仓,股息率降低了减仓。</p><p>对于成长性行业,其净利润可能表现为指数函数或者以指数函数为中轴的波动曲线。其现金流折现的值要远高于成熟行业,而且每过一年只要成长预期还能维持其折现价值还能明显提升。投资者在投资此类公司的时候对于股价相对于折现价值的溢价,容忍度就比较大。毕竟股价超过折现价值的部分很容易通过成长化解。</p><p>这两种选股思路各有优缺点。股息选股的方法,相对安全性较高,但是缺点是公司的长期持有复利效应不明显。比如最近两年特别火的<a href=\"https://xueqiu.com/S/SH601088?from=status_stock_match\" class=\"xq_stock\">中国神华</a>,虽然这两年涨幅不错但是如果你拉长时间看,其实现在的复权价也就刚刚超过2007年的高点不多。从2007年高点到2024年历史高点,17年的时间高点连线涨幅大概只有25%。但是,成长股就完全不同了,比如<a href=\"https://xueqiu.com/S/SZ002371?from=status_stock_match\" class=\"xq_stock\">北方华创</a>现在的复权股价是十几年前的20倍。这也就意味着,吃息佬需要不断择时高点抛售,低点建仓才能确保其收益保持较高水平。</p><p>成长选股的方法如果能够选中长期收益可观,但是缺点是成长性的确定性很低。多数成长性企业,其成长性多数体现在规模较小的时候。但是,规模越小的企业抗风险能力越弱。比如十几年前那一波<a href=\"https://xueqiu.com/S/SH000941?from=status_stock_match\" class=\"xq_stock\">新能源</a>行情,在前几年大干快上的时候,<a href=\"https://xueqiu.com/S/SZ002202?from=status_stock_match\" 
class=\"xq_stock\">金风科技</a>,华锐风电,无锡尚德,汉能等一大批龙头成长性很高,估值也被干到了令人瞠目结舌的地步。但是,2013年之后国家收紧货币,这批龙头就算不破产,股价也被腰斩了好几次。新的周期,光伏的龙头成了<a href=\"https://xueqiu.com/S/SH601012?from=status_stock_match\" class=\"xq_stock\">隆基绿能</a>,正所谓长江后浪推前浪,前浪死在沙滩上。</p><p>最近2年为何红利股的走势特别强劲?一个很重要的原因是我国的无风险收益率显著下行,推动了一大批原来配置债券的机构将资金配置向红利股倾斜,比如:保险资金和养老金。</p><p>我们以保险举例,我曾经在某个高考群里看到某位富爸爸给自己娃买的分红险,出生时500万一次缴清,18岁后每年返还38万一直到80岁。我自己算了一下,对应的收益率是年化3.8%。也就是说,保险公司收了客户500万,必须找到收益率超过3.8%的资产进行配置才能赚钱。</p><p>2007年的时候3.8%不算什么,那时候中国经济增长强劲,6%-8%回报的企业债都可以找到很多。但是,到了现在问题出现了,目前无风险的10年期国债收益率只有1.7%,企业债也很难超过3%。这种情况下,保险公司过去的高收益债券到期兑付后,资金很难找到3.8%的债券和客户的合约对应,这个问题如果不解决,那么就意味着保险公司在剩下的合约周期内都是亏钱的。这种情况我们称为利差损。当年日本泡沫破灭的时候,大量的寿险公司破产就是因为利差损。所以,保险公司逼不得已只能把大量资金买入分红稳定的红利股。</p><p>红利选股和成长选股各有优缺点,而且没有一个策略是永远最优的。在熊市中红利股的高收益分红可以起到托底作用,使得股价少跌甚至逆势上涨。但是到了牛市中,红利股可能因为缺乏题材和业绩增长平庸而遭人唾弃,最典型的例子就是<a href=\"https://xueqiu.com/S/SH600900?from=status_stock_match\" class=\"xq_stock\">长江电力</a>,非常典型的熊市牛股,牛市熊股。</p><p>正如我标题写的,股息可以作为投资之盾,降低风险,关键时候可以保命;成长可以作为投资之矛,获取更高的收益。正所谓:有矛无盾百战死,有盾无矛万年龟。单纯强调成长时间长了总有折戟沉沙的时候,只看股息的长期投资收益可能比较低。</p><p>投资者在股市里可以以股息为盾,成长为矛,要么选择在不同时段使用不同的策略,要么就是找一些能够兼顾股息率和成长性的标的。去年下半年我曾经通过给几只大热门红利股泼冷水的方式暗示股息策略见好就收。今年上半年我也提示过银行股一阶段行情接近尾声,注意切换选股思路。</p><p>目前看大金融板块和<a href=\"https://xueqiu.com/S/SH000997?from=status_stock_match\" class=\"xq_stock\">大消费</a>板块,依然兼具股息和成长的双重属性,特别是部分优质的银行股、保险股和白酒股,后市仍大有可为。</p><p>@<a href=\"https://xueqiu.com/n/今日话题\" target=\"_blank\">@今日话题</a> @<a href=\"https://xueqiu.com/n/雪球创作者中心\" target=\"_blank\">@雪球创作者中心</a> <a href=\"https://xueqiu.com/S/SH600036\" target=\"_blank\">$招商银行(SH600036)$</a> </p>","show_cover_pic": true,"legal_user_visible": false,"is_column": true,"is_ss_multi_pic": false,"is_bonus": false,"cover_pic": "","answer_question": false,"is_no_archive": false,"is_original_declare": true,"mp_not_show_status": false,"forbidden_comment": false,"is_private": false,"allow_reply": null,"toggle_action": null,"allow_reply_success": null,"from_url": "https://xueqiu.com/u/1821992043"}, // ... Many other post details]
The scraper produces comprehensive JSON output with detailed post metadata. Each extracted post contains extensive information across multiple categories:
Core Post Information:
- ID: Unique post identifier for database storage and reference
- User ID: Identifies the post author for user behavior analysis
- Title: Post headline for content categorization
- Text: Full post content for sentiment and topic analysis
- Created At: Unix timestamp in milliseconds (for example 1754361591000) for temporal analysis and trend identification
- Source: Platform attribution and content origin tracking
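As a rough illustration of how these core fields might be prepared for analysis, the sketch below converts the millisecond created_at value to a UTC datetime and strips HTML from text. The function name normalize_post and the output keys created_at_utc and text_plain are assumptions, and the text value in the sample is a shortened placeholder rather than the full post.

```python
import re
from datetime import datetime, timezone
from html import unescape

TAG = re.compile(r"<[^>]+>")


def normalize_post(post):
    """Reduce a raw post dict to a few analysis-ready core fields."""
    created_ms = post.get("created_at")
    created = (
        datetime.fromtimestamp(created_ms / 1000, tz=timezone.utc)
        if created_ms is not None
        else None
    )
    html = post.get("text") or ""
    # Turn paragraph ends and <br/> into newlines, then drop remaining tags.
    plain = re.sub(r"</p>|<br\s*/?>", "\n", html)
    plain = unescape(TAG.sub("", plain)).strip()
    return {
        "id": post.get("id"),
        "user_id": post.get("user_id"),
        "title": post.get("title") or "",
        "created_at_utc": created,
        "text_plain": plain,
        "source": post.get("source"),
    }


sample = {
    "id": 345575898,
    "user_id": 1821992043,
    "created_at": 1754361591000,
    "source": "雪球",
    # Shortened placeholder for the HTML in the "text" field.
    "text": "<p>First paragraph.</p><p>Second "
            "<a href=\"https://xueqiu.com/S/SH600036\">$招商银行(SH600036)$</a></p>",
}
core = normalize_post(sample)
print(core["created_at_utc"].isoformat())  # 2025-08-05T02:39:51+00:00
print(core["text_plain"])
```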
Engagement Metrics:
- Retweet Count: Measures post virality and reach
- Favorite Count: Indicates user appreciation and bookmark behavior
- Like Count: Primary engagement metric for popularity assessment
- View Count: Tracks post visibility and audience reach
- Comment ID & Reply Count: Measures discussion generation and community engagement
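One simple way to work with these metrics is to collect them into a single record per post. The sketch below uses the field names from the sample output (retweet_count, fav_count, like_count, reply_count, view_count); the total_interactions key is an assumption introduced here, an unweighted sum chosen only for illustration.

```python
ENGAGEMENT_FIELDS = (
    "retweet_count", "fav_count", "like_count", "reply_count", "view_count",
)


def engagement_summary(post):
    """Collect engagement counters, treating missing or null values as 0."""
    counts = {name: int(post.get(name) or 0) for name in ENGAGEMENT_FIELDS}
    # Assumption: an unweighted sum of interactions; views are left out
    # because the sample output reports view_count as 0.
    counts["total_interactions"] = (
        counts["retweet_count"]
        + counts["fav_count"]
        + counts["like_count"]
        + counts["reply_count"]
    )
    return counts


# Values taken from the sample post in the output above.
sample = {
    "retweet_count": 30,
    "fav_count": 190,
    "like_count": 370,
    "reply_count": 356,
    "view_count": 0,
}
summary = engagement_summary(sample)
print(summary["total_interactions"])  # 946
```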
Financial Context:
- Symbol ID: Stock ticker associations for market correlation analysis
- Stock Correlation: Direct links between posts and specific securities
- Target: Relative URL path of the post on Xueqiu (for example /1821992043/345575898)
- Stock List: Multiple securities referenced in single posts
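A sketch of how the securities referenced by a post could be gathered, assuming the structures seen in the sample output: stock_list entries with a symbol key, and stock links inside text of the form https://xueqiu.com/S/<code>. The function name referenced_symbols is hypothetical, and the sample text is a shortened excerpt.

```python
import re

# Assumption: stock links in the post HTML look like the ones in the sample,
# e.g. href="https://xueqiu.com/S/SH601088?from=status_stock_match".
STOCK_LINK = re.compile(r'href="https://xueqiu\.com/S/([A-Z0-9]+)')


def referenced_symbols(post):
    """Return symbols from stock_list and from stock links in the text."""
    symbols = [entry["symbol"] for entry in post.get("stock_list") or []]
    symbols.extend(STOCK_LINK.findall(post.get("text") or ""))
    # dict.fromkeys removes duplicates while keeping first-seen order.
    return list(dict.fromkeys(symbols))


sample = {
    "stock_list": [
        {"symbol": "BK2049", "type": "35"},
        {"symbol": "BK2415", "type": "35"},
    ],
    "text": '<p><a href="https://xueqiu.com/S/SH601088?from=status_stock_match" '
            'class="xq_stock">中国神华</a> '
            '<a href="https://xueqiu.com/S/SH600036" target="_blank">'
            '$招商银行(SH600036)$</a></p>',
}
symbols = referenced_symbols(sample)
print(symbols)  # ['BK2049', 'BK2415', 'SH601088', 'SH600036']
```

In the sample, stock_list holds sector-style codes (BK...) while individual tickers appear only as links in the text, so combining both sources may give a fuller picture.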
Content Classification:
- Type: Post category code, returned as a string such as "3" (categories include original, retweet, comment, analysis)
- Tags: Content categorization for filtered analysis
- Topic Title & Topic Description: Subject matter classification
- Meta Keywords: JSON-encoded string with post position, IP location, and stock correlation data
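In the sample output, meta_keywords arrives as a JSON string nested inside the post, so it needs a second decoding step. A minimal sketch follows; the helper name parse_meta_keywords and the added stock_correlation_symbols key are assumptions, and the underscore-separated format of stockCorrelation is inferred from the single sample above.

```python
import json


def parse_meta_keywords(post):
    """Decode the JSON string in meta_keywords; return {} if unusable."""
    raw = post.get("meta_keywords")
    if not raw:
        return {}
    try:
        meta = json.loads(raw)
    except (TypeError, ValueError):
        return {}
    if not isinstance(meta, dict):
        return {}
    # Assumption: stockCorrelation is an underscore-separated symbol list.
    correlation = meta.get("stockCorrelation") or ""
    meta["stock_correlation_symbols"] = [s for s in correlation.split("_") if s]
    return meta


# meta_keywords value copied from the sample output.
sample = {
    "meta_keywords": "{\"post_position\":\"pc_home_post\","
                     "\"ip_location\":\"北京\","
                     "\"stockCorrelation\":\"BK2049_BK2415_BK0056_BK0055\","
                     "\"stockList\":\"BK2049_35,BK2415_35,BK0056_81,BK0055_81\"}"
}
meta = parse_meta_keywords(sample)
print(meta["ip_location"], meta["stock_correlation_symbols"])
```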
User Interaction Data:
- Liked: Personal engagement status
- Favorited: Bookmark status with creation timestamp
- Blocking/Blocked: User relationship status
- Answer Count: Q&A interaction metrics
Media and Visual Content:
- Pic: Image attachments for visual content analysis
- Video Info: Multimedia content metadata
- Cover Pic: Visual presentation elements
- Image Info List: Comprehensive media cataloging
Platform-Specific Features:
- Weixin Retweet Count: Cross-platform sharing metrics
- Donate Count & Donate Snowcoin: Platform-specific engagement features
- Reward metrics: Monetization and recognition data
- Controversial: Content flagging for risk assessment
Each data point serves specific analytical purposes, from basic sentiment analysis using text and engagement metrics to sophisticated market correlation studies using financial symbols and temporal data.
Usage Guide
Step 1: Profile Selection. Identify target Xueqiu user profiles by analyzing their posting frequency, follower count, and content relevance to your research objectives. Focus on profiles with consistent posting activity and substantial engagement.
Step 2: Configuration Setup. Configure the input JSON with appropriate retry limits and proxy settings. For Chinese market data, consider using Asian proxy locations to maintain authentic access patterns.
Step 3: Data Collection. Execute the scraper with your configured parameters. Monitor the collection process and adjust max_items_per_url based on your data requirements and rate limiting considerations.
Step 4: Data Processing. Process the extracted data through your analysis pipeline. The comprehensive field structure allows for multiple analytical approaches from sentiment analysis to network analysis.
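As one possible starting point for the processing step, the sketch below groups scraped posts by the from_url field and removes repeated post IDs. The function name summarize_by_profile and the summary keys are assumptions; the second profile's post is made-up data used only to show the grouping.

```python
from collections import defaultdict


def summarize_by_profile(posts):
    """Group posts by from_url, skipping posts whose id was already seen."""
    seen_ids = set()
    summary = defaultdict(
        lambda: {"posts": 0, "likes": 0, "latest_created_at": 0}
    )
    for post in posts:
        if post["id"] in seen_ids:
            continue
        seen_ids.add(post["id"])
        row = summary[post.get("from_url", "unknown")]
        row["posts"] += 1
        row["likes"] += post.get("like_count") or 0
        row["latest_created_at"] = max(
            row["latest_created_at"], post.get("created_at") or 0
        )
    return dict(summary)


posts = [
    # Values from the sample output.
    {"id": 345575898, "like_count": 370, "created_at": 1754361591000,
     "from_url": "https://xueqiu.com/u/1821992043"},
    # Same post collected twice; it is counted once.
    {"id": 345575898, "like_count": 370, "created_at": 1754361591000,
     "from_url": "https://xueqiu.com/u/1821992043"},
    # Made-up post for a second profile, for illustration only.
    {"id": 1, "like_count": 5, "created_at": 1700000000000,
     "from_url": "https://xueqiu.com/u/5584866422"},
]
by_profile = summarize_by_profile(posts)
for url, row in by_profile.items():
    print(url, row)
```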
Best Practices:
- Implement respectful scraping practices with appropriate delays
- Use residential proxies to avoid detection
- Regularly update user profile lists to maintain data freshness
- Validate data quality by cross-referencing engagement metrics
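The data-quality practice above can be approximated with a few basic checks. This sketch is one possible reading of it, not the Actor's own validation: it checks that key fields are present, that engagement counters are non-negative integers, and that the nested user.id agrees with the top-level user_id. The helper name quality_issues is hypothetical.

```python
COUNT_FIELDS = ("retweet_count", "fav_count", "like_count", "reply_count")


def quality_issues(post):
    """Return a list of human-readable problems found in one post."""
    issues = []
    for name in ("id", "user_id", "created_at"):
        if post.get(name) is None:
            issues.append(f"missing {name}")
    for name in COUNT_FIELDS:
        value = post.get(name)
        if value is not None and (not isinstance(value, int) or value < 0):
            issues.append(f"bad {name}: {value!r}")
    nested_user_id = (post.get("user") or {}).get("id")
    if nested_user_id is not None and nested_user_id != post.get("user_id"):
        issues.append("user.id does not match user_id")
    return issues


# Values from the sample output.
good = {
    "id": 345575898, "user_id": 1821992043, "created_at": 1754361591000,
    "retweet_count": 30, "fav_count": 190, "like_count": 370,
    "reply_count": 356, "user": {"id": 1821992043},
}
# Deliberately broken record, for illustration only.
bad = {"id": 2, "user_id": 10, "like_count": -1, "user": {"id": 11}}

print(quality_issues(good))  # []
print(quality_issues(bad))
```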
Common Issues and Solutions:
- Rate Limiting: Adjust max_retries_per_url and implement progressive delays
- Profile Access: Ensure target profiles are public and accessible
- Data Completeness: Some fields may be empty depending on post type and user settings
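The "progressive delays" advice can be sketched as exponential backoff in your own client code that wraps the scraper or any follow-up requests. How the Actor retries internally is not documented here, so everything below is an assumption: the helper names, the base delay of 1 second, the doubling factor, the 30-second cap, and the 10% jitter.

```python
import random
import time


def backoff_delays(max_retries, base=1.0, factor=2.0, cap=30.0):
    """Yield one delay per retry: base, base*factor, ..., limited to cap."""
    for attempt in range(max_retries):
        yield min(cap, base * factor ** attempt)


def fetch_with_retries(fetch, url, max_retries=2, sleep=time.sleep):
    """Call fetch(url); on failure wait progressively longer, then retry."""
    delays = list(backoff_delays(max_retries))
    for attempt in range(max_retries + 1):
        try:
            return fetch(url)
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted, surface the last error
            # Up to 10% jitter so parallel workers do not retry in lockstep.
            sleep(delays[attempt] * (1 + random.uniform(0, 0.1)))


# Demo with a simulated fetcher that fails twice and then succeeds.
calls = {"n": 0}


def flaky_fetch(url):
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated failure")
    return f"ok: {url}"


waits = []  # record the delays instead of actually sleeping
result = fetch_with_retries(
    flaky_fetch, "https://xueqiu.com/u/1821992043",
    max_retries=2, sleep=waits.append,
)
print(result, [round(w, 2) for w in waits])
```

Passing sleep as a parameter keeps the demo instant and makes the retry logic easy to test; in real use the default time.sleep applies.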
Benefits and Applications
Time Efficiency: Automated collection replaces manual data gathering that would require thousands of hours for comprehensive datasets, enabling researchers to focus on analysis rather than data collection.
Research Applications: The extracted data supports various analytical approaches including sentiment analysis for market prediction, user behavior studies for understanding retail investor patterns, and social network analysis for identifying influence patterns in financial discussions.
Business Value: Investment firms can leverage this data for alternative data strategies, incorporating social sentiment into quantitative models, and identifying emerging market trends before they appear in traditional financial media.
Market Intelligence: The data provides insights into experienced retail investor interests and opinions toward thousands of A-share, HK and U.S. stocks, offering a unique window into Chinese market sentiment that's not available through traditional data sources.
Conclusion
The Xueqiu.com Posts Scraper provides infrastructure for accessing one of China's leading financial social media platforms, delivering comprehensive data that supports sophisticated investment research and market analysis. With its detailed field extraction and scalable architecture, this tool helps financial professionals draw on the collective intelligence of Chinese retail investors for informed decision-making.
Ready to unlock the insights hidden in China's premier financial social network? Start extracting valuable market sentiment data today with our comprehensive Xueqiu.com Posts Scraper.
Related Actors
Xueqiu Profile Details Scraper: Extract detailed Posts information from Xueqiu.com effortlessly with this powerful scraping tool
Your feedback
We are always working to improve the performance of our Actors. If you have technical feedback about the Xueqiu.com Posts Scraper or have found a bug, please create an issue on the Actor's Issues tab in Apify Console.