informal

How AI Will Change the Mobile Ecosystem

Author: Benson
February 24, 2026 00:00
How mobile development will look next year

There are many mobile development platforms for non-technical users nowadays, like Rork and Lovable. They are not perfect now, but they will become much better within a year, just as ChatGPT changed in the year after it was released, and just as vibe coding agents have progressed over the last year.

These platforms can generate everything we need to submit to the App Store and Google Play Store: app packages, screenshots, privacy policies, and descriptions. Maybe next year, anyone will be able to build and publish an app in 3 hours. I don't mean a professionally designed app like Photoshop or Figma, but it's very possible for a standard music app or management app.

Z-Library & Copyright

Many users use Z-Library to download books rather than paying for them, but they object when someone wants to clone their app. What is the reason? I think it's because we're developers, and we make up the majority of voices on the internet. We benefit from Z-Library, but we lose out when an application is copied. If we set our own attitude aside, copying applications will become more like Z-Library.

Moreover, how do we define copyright for an app? Unlike a book, it's hard to tell whether an app violates copyright or not.

Based on the two arguments above, I think copying will happen more in the future, and some users will choose the copies because they are cheaper.

Media news

Another example is the shift from newspapers to UGC platforms. Nowadays, we get information from many websites other than official news media, like Hacker News, Twitter, and personal blogs. It costs nothing. Well, not quite nothing: we pay for the cloud costs through advertisements. It's not much, just like the cost of building a new app next year.

Nowadays, a huge amount of information is generated by non-professional writers, and most users read them more than professional news media.

New media didn't kill news, but made the ecosystem richer. It might be the same for applications in the future.

Better market: Games

We usually use apps to solve a problem, which is different from reading books. We usually don't want to explore new apps when we have no need for them. Books are more like games: we like to explore new games for fun. So I think AI will have more influence on games, as in other creative fields.

Summary

AI will bring strong creativity to the app ecosystem. Costs will become lower for users, the market will become more competitive for creators, and the ecosystem will become richer for stores.

Introduction to Fraud Detection

Author: Benson
September 15, 2025 00:00
In short, the target of fraud detection is to detect fraudulent traffic and filter or block it. What makes this interesting and difficult is that what we are detecting is human; defending against humans can be as hard as you can imagine.

Strategy

Rules

Most fraudulent traffic is produced purely by machines or devices without any human involved, like DDoS. Rules can usually handle this kind of fraudulent traffic because it is too obvious. The core idea of designing a rule is to find a dimension and a feature, like the count of traffic from the same IP. What we need is to find a proper dimension and valid features for the current case.

Models

After we block the huge amount of fraudulent traffic, fraud rings will usually add more human-like features to the traffic, or create the behaviour entirely manually. What they care about is ROI, just like we do. If the profit is high enough for the manual operation time, it is worth it to them.

We need more features and signals to detect fraudulent traffic when it is more likely produced by humans; that's why we need statistical algorithms and machine learning models:

- Anomaly detection
- Linear regression
- Tree-based models
- Graph-based models
- Deep learning models
- Mixture of the above

Fraud simulation

We can simulate fraud rings attacking our business, which can improve anti-fraud quality and help us estimate the cost of breaking anti-fraud products.

Architecture

Online

An online service usually responds within 50 ms. Errors in an online product might block all users of the business, and this sometimes really happens because it is designed to block users. So we should make sure we don't give false positive results to the business.

Streaming

We need a streaming system to calculate feature values; these features will be used in rules and models.

Offline

Some complex algorithms cannot be implemented in online and streaming systems. In the offline stage we can keep iterating with high productivity, and we can try various methods without considering efficiency.

In addition, we can detect the hardest fraudulent traffic in the offline stage, which is beneficial for online model training.

Risk alert

Risk alerts can protect against false negatives. We can lower the thresholds of online rules and models to create risk alert methods; it's also a great way to apply low-accuracy algorithms.

Monitor service

Monitor the traffic that we judge fraudulent, to prevent false positives. Monitor and analyse feedback from the business; they have a clearer view of this traffic.

Summary

Fraud detection is a mixed area that needs engineering, statistics, algorithms and security; it's a very interesting topic if you like it. It will become more challenging under the influence of AI, because there is less and less difference between humans and AI.
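
The "dimension plus feature" idea from the Rules section, combined with the lower alert threshold from the Risk alert section, can be sketched as below. This is only an illustration: the class name `IpRateRule`, the window length, and both thresholds are assumptions, not values from the post. The dimension is the client IP and the feature is the request count inside a sliding time window.

```python
from collections import defaultdict, deque


class IpRateRule:
    """Hypothetical rule: dimension = IP, feature = request count in a sliding window."""

    def __init__(self, window_seconds=60.0, block_threshold=100, alert_threshold=50):
        self.window_seconds = window_seconds
        self.block_threshold = block_threshold  # strict threshold used for online blocking
        self.alert_threshold = alert_threshold  # lower threshold that only raises a risk alert
        self._events = defaultdict(deque)       # ip -> timestamps still inside the window

    def check(self, ip, timestamp):
        """Record one request and return "block", "alert", or "pass"."""
        events = self._events[ip]
        events.append(timestamp)
        # Drop timestamps that have fallen out of the window.
        while events and events[0] <= timestamp - self.window_seconds:
            events.popleft()
        count = len(events)
        if count >= self.block_threshold:
            return "block"
        if count >= self.alert_threshold:
            return "alert"
        return "pass"
```

Keeping the alert threshold separate from the block threshold means a borderline IP is surfaced for review without risking a false positive for the business.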

LLM Post-Training experience

Author: Benson
June 23, 2025 00:00
Prompt

The prompt is the most direct way to influence the response. Tips for a good prompt:

- Give clear instructions about what we want
- Provide the necessary context, role, tone, and format
- Guide the LLM to output its reasoning process before the final answer
- More instructions, fewer constraints
- An example can ensure the output structure is the same as the example

The purpose of the prompt in post-training is to build the best reasoning structure in the response; training can optimize the other detailed contents of the response.

One-shot learning

- One example in the prompt (one shot) can ensure the output structure is the same as the example.
- In binary-classification tasks, one example will probably bias the answer towards the one in the example.
- In binary-classification tasks, two examples will probably make the answer unstable.

Experience / Conclusion

- The size of the model to train is related to the information volume of the dataset
- A larger model needs more information volume to fine-tune
- We can use a small model to test at low cost whether the solution is feasible
- A smaller model has more stable responses
- The amount of data is positively correlated with model performance
- The quality of data is positively correlated with model performance

Training process

The purpose of training is to ensure that performance on the test dataset increases in a stable trend and range.

- Ensure the loss/reward curve and the performance on the test dataset change with the same trend
  - If performance on the test dataset doesn't increase as expected, overfitting / reward hacking is occurring.
  - If the loss cannot be reduced as expected, there is something wrong in the training dataset
- Adjust the learning rate and regularization penalty by observing the loss curve over training steps
  - If the loss decreases slowly, raise the LR.
  - If the loss curve is unstable, lower the LR.
  - When overfitting occurs, raise the regularization penalty.
  - If the loss stops improving in the late stages, try lowering it.
- Verify an idea with a purely controlled experiment
- Repeat exactly the same experiment to exclude the influence of randomness
- Make the LLM output its intermediate reasoning process before the final answer
- For a specific task, put as much logic as possible in rules rather than in the prompt
- A thinking reward is valid and necessary in RL
- Model rewards, even multi-model rewards, are helpful in RL

Multi-stage training

The purpose of the dataset is to provide information for the model to learn. In the late stages the model already knows more than before, so more additional information should be sent to it. So, in the late stages, we should increase information diversity.

How to increase information diversity:

- Put hard samples in the late training stages
- Increase the temperature in the late training stages for GRPO
- Select samples which have unstable results for GRPO

Reference

- Google prompt engineering
- Six Key Elements of AI Agent Prompt Engineering
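
The last tip under Multi-stage training, selecting samples with unstable results for GRPO, could look roughly like the sketch below. It assumes each prompt has already been sampled several times and scored with a reward where a positive value means a correct answer; the function name and the input format are hypothetical. The motivation is that GRPO computes advantages relative to the group mean, so a prompt whose samples are all right or all wrong contributes no learning signal.

```python
def select_unstable_prompts(rewards_by_prompt, low=0.0, high=1.0):
    """Keep prompts whose sampled answers are neither all wrong nor all right.

    rewards_by_prompt: dict mapping a prompt id to the list of rewards of its samples
    (assumed format; a reward > 0 counts as a correct answer).
    """
    selected = []
    for prompt_id, rewards in rewards_by_prompt.items():
        if not rewards:
            continue
        pass_rate = sum(1 for r in rewards if r > 0) / len(rewards)
        if low < pass_rate < high:
            selected.append((prompt_id, pass_rate))
    # Prompts closest to a 50% pass rate have the most mixed outcomes, so list them first.
    selected.sort(key=lambda item: abs(item[1] - 0.5))
    return selected
```

Tightening `low` and `high` (for example 0.2 and 0.8) is one way to keep only the most unstable prompts for a late training stage.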

Papers I read recently about LLM applications

Author: Benson
June 22, 2025 00:00
How much do LLMs memorize?

- Key definitions
  - Unintended memorization: memorizing a specific dataset
  - Generalization (intended memorization): information about the true data-generating process
- Calculation method: information entropy and mutual information
- Double descent appears at the point where unintended memorization changes into generalization
- GPT models store 3.6 bits of data per parameter
- The value for float32 is 9% higher than for float16

Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

- A trade-off between pre-trained model size and inference time (inference length)
- Performance can outperform a model 14x the size
- Performance is better on easy and medium problems; whether a question is easy, medium or hard is judged by the pass rate
- Two ways to increase inference-time compute
  - Best-of-N: sample N outputs in parallel and choose the best one based on a learned verifier or reward function.
  - Revise response: revise the original response

Prolonged Reinforcement Learning

- Temperature: increase the temperature to avoid entropy collapse
- Decoupled clip to increase the exploration space
- Dynamic sampling: erase samples that are all right or all wrong
- Calculate the loss at the token level instead of the sample level (DAPO)
- KL regularization and reference model reset

Illusion of thinking (Hanoi tasks)

- Reasoning models perform worse than general models on simple questions, because they reach a wrong answer while thinking even after already getting the correct one [overthinking]
- Better performance on medium-level questions
- Zero performance on hard questions

Gemini 2.5 tech report

- Dataset: ensure dataset quality; filter and drop duplicates
- Post-training
  - Verifiable rewards and model-based generative rewards to provide sophisticated and scalable feedback signals
    - Verifiable reward
    - Model-based: more sophisticated and scalable feedback signals
  - Update the LR method to improve the stability of training
  - Result: learning in complex spaces
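
The best-of-N strategy listed under the test-time compute paper can be illustrated with a minimal sketch. Everything here is a stand-in: `generate` would be a call to a model and `score` a learned verifier or reward function, neither of which is specified in the notes, and this version samples sequentially rather than in parallel.

```python
def best_of_n(generate, score, prompt, n=4):
    """Sample n candidate answers and return the one the scorer rates highest."""
    candidates = [generate(prompt) for _ in range(n)]
    # max() keeps the first candidate in case of a tie.
    return max(candidates, key=score)


def toy_verifier(answer):
    """Hypothetical verifier for the prompt "2 + 2 = ?": reward only the answer 4."""
    return 1.0 if answer.strip() == "4" else 0.0
```

The cost grows linearly with `n`, which is the trade-off against simply using a larger model that the paper examines.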

Weekly-#4 First insight into LLM acceleration

Author: Benson
September 15, 2024 00:00
Product process

PopTranslate

I finally found the root reason why my Chrome extension was rejected: I had put the api_key in the .gitignore file, so it could not be uploaded to the Chrome extension store successfully when I uploaded the zip file. Actually, I found the root reason through the crx file generated by the Chrome extension. After I pointed out this reason, the Google extension team also responded to my email and told me the same reason, which made me feel valued.

I also published this product on several media platforms. What left a deep impression on me was Reddit: I pushed my project so hard on this platform that it banned my account after five hours. I deserved it. So I registered another account; I need to cultivate it for several days so that I can express myself more freely. The results of publishing haven't fully shown yet, so I will summarize them next week.

unsloth

I learned about "Fast Cross Entropy Loss" from the unsloth blog. I already understand the theory; I also need to prove it with my own code. The most important thing I learned this week is that I finally understand why LLM acceleration projects can work. Most LLM code is designed and written for common usage, but right now loss functions and transformers are used millions of times in the same form. As a result, we can make the code and theory more specific so that they fit the current network and task well. Changing a general function or theory into a specific one can always improve the efficiency of training and fine-tuning.

YouTube

I uploaded two videos about solutions to LeetCode problems this week. This has plenty of benefits: for example, practicing my English speaking, preparing for English interviews, and trying to be a YouTuber.

Daily

- 9.9: running, prepare Hugging Face token
- 9.10: reproduce code, but the result is different; maybe the cause is the random seed
- 9.11: read related HN posts and blogs about unsloth
- 9.12: read the fast_crossEntropyLoss code to understand why it works
- 9.13: tried to solve another issue but failed
- 9.14: publish PopTranslate
- 9.15: rest

Personal life

Reading

- alphaxiv: it's perfect for both idea and value
- Front-end UI: many great designs
- even typescript
- so many post tools with AI

Exercise

I ran five times this week, half an hour each time. It's a good start. If I can keep this up, maybe I should reward myself with some equipment.

Thought

1) unsloth also references some other papers. There is plenty of work that can be done on this topic; it's hard, but I think it's worth it.
2) Publishing a new project is a great thing; I can receive thanks from others, even if it's a small thank-you. My own project feels like my own child; it's totally different.

This week

- Focus on unsloth: less than expected
  - PR
  - Reproduce
  - Know the theory of acceleration and produce an essay
- Reading: spend some time on papers
- Exercise: running in the mornings: better than expected
- Social account

Next week

- Focus on unsloth
  - Validation of thought
  - Know more
  - Try new thought
- Reading more
- Exercise
- Produce more valued content
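
As a footnote to the unsloth section above: one standard way to specialize cross-entropy is to write it as `logsumexp(logits) - logits[target]` instead of computing a full softmax and then taking a log. The sketch below is a plain-Python illustration of that identity only; it is not unsloth's actual kernel, and the function names are made up for this example.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy for one example via logsumexp(logits) - logits[target]."""
    m = max(logits)  # subtract the max so exp() cannot overflow
    logsumexp = m + math.log(sum(math.exp(x - m) for x in logits))
    return logsumexp - logits[target]


def cross_entropy_naive(logits, target):
    """Textbook version: softmax first, then -log of the target probability."""
    exps = [math.exp(x) for x in logits]
    return -math.log(exps[target] / sum(exps))
```

Both functions agree on small inputs, but the naive version raises `OverflowError` for a logit such as 1000.0, while the logsumexp form stays finite.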

Weekly-#3 PopTranslate

Author: Benson
September 8, 2024 00:00
Product process

PopTranslate

I finally finished this project. It works on my computer and my girlfriend's computer, and I like it. The sad news is that my submission for review was rejected twice by Google. I didn't know the reason and asked for a further review.

Other product attempts

1) Text-to-voice in-browser

After learning about transformer.js, which aims to bridge the gap between the web and LLMs, I tried to use it to realize this project, but it failed again. I also met some errors that I could not solve in a short time. As for the reason, transformer.js is not mature enough, and deeper knowledge and techniques in the audio field are necessary to solve it. Maybe I will do that if I have free time in the future. Consequently, I put this project on hold again.

2) Analysis of 500 startups supported by YC

Due to the lack of fresh ideas, I analysed the 500 startups supported by YC in 2024. Some conclusions:

- 50%+ are LLM or AI related based on the title; the real share is higher than that.
- 64% belong to the B2B category. This is the main area of AI application.
- 10% belong to the healthcare category, the second biggest part.
- Education is the smallest category, at 1.2%. There will be more opportunity if AI becomes more accurate.
- Projects that are great and suitable for me
  - unsloth: LLM fine-tuning acceleration
  - gpt-pilot: making AI develop like a real developer; this will be the next stage of AI coding
  - transformer.js: bridges the gap between the web and LLMs, from Hugging Face

3) LLM fine-tuning acceleration

I opened my first PR for this project and it got merged; it also happened to be the 1000th PR, just a coincidence. The open source version of this project can accelerate the fine-tuning stage of most open source models 2x; the pro version provides more support, with a best performance of 30x. It is worth more exploration.

Daily

- 9.2: PopTranslate
- 9.3: Publish PopTranslate
- 9.4: Reading
- 9.5: Reading
- 9.6: Try to reproduce a specific issue of unsloth
- 9.7: Submit PR to unsloth. Try transformers.js
- 9.8: Rest

Personal life

Reading

- limu's speech at shjt university: five-star recommendation
- AI tools help you grow fans
- Greppability is an underrated code metric: useful and practical for developers, especially for those who have worked in big companies
- Interest-based communities have great growth: I have the same sense; interest is the ultimate way of socializing
- Cost of LLM will continue to decrease (Andrew Ng): Moore's law will work in the latest field.
- Raise 1B with one html (ssi)
- Photos have more power than text in terms of creation
- Next product of OpenAI will charge 2k/month?
- Architecture is the main problem when training 400B+ models (limu)

Exercise

Three times this week, sad news. But I started running in the morning, just from Monday. Keep it up.

Thought

1) Creating a new product is a hard problem; I realized this more after I started doing it.
2) Creating entertainment content is a difficult task, even though TikTok is so popular all over the world. Hot content always needs everything: a fresh idea, captured details, and so on. This is hard for AI to create. In other words, AI is not smart enough to produce content that can attract consumers. Consequently, AI is suitable for small business; this is the main usage of AI.
3) I'm tired, but I enjoy it.

Next week

- Focus on unsloth
  - PR
  - Reproduce
  - Know the theory of acceleration and produce an essay
- Reading: spend some time on papers
- Exercise: running in the mornings
- Social account

Weekly-#2 The failure of my first product

Author: Benson
September 1, 2024 00:00
Product process

Voice correction

From Monday to Wednesday, I finished the first version of my first product, voice correction. It still has several drawbacks due to my computer's lack of performance. The price of cloud machines is high; that is the main reason I'm not continuing this project for now. The first version is usable for me or my girlfriend, but I'm too shy to share it with the public. There are several problems:

- Some speech is sometimes not converted to text because of the lack of computing performance. A better computer can alleviate this issue.
- The order of the text is not always the same as the speech, because I did not reorder it after converting voice to text with multiple threads. I already have the related techniques; I will do that if I restart this project in the future.
- The performance of the voice-to-text and text correction models is not always good. A bigger model and a better machine can reduce this issue.

I still learned a lot from this project, for example:

- The definition of async functions and how to use them in Python; they are essential in streaming or speaking scenarios.
- The price of machines and how much machine power LLMs need
- How to use a Discord bot, which is powerful
- The potential of WebGPU, which makes lots of LLM products possible

Other product attempts

Collection of web-tools

A collection of web-tools, where a web-tool is defined as a product that doesn't rely on a service. This kind of product has low cost and is easy to copy, and it has fewer outages or related issues. After exploring for several hours, I found a great open source product named it-tools (https://it-tools.tech/). I don't need to do that again.

I moved my attention to web-tools for AI products. This is also not an ideal product, because most LLM products are not available web-only.

- Text-based: there is a related product named mlc-llm; this is fine.
- Image-based: most image models are 12B, which is too big to run in-browser
- Video-based: it's even less feasible than image-based
- Audio-based:
  - voice-to-text: whisper.wasm is great
  - text-to-voice: there is no related product for now; I can do that.

Text-to-voice in-browser

Influenced by my collection of web-tools idea, I want to make a text-to-voice product just like whisper.wasm. There are several advantages:

- A service is not necessary; I can host it on GitHub Pages, so it can be free for the public.
- It runs only in the browser, so there are no privacy concerns.
- It is useful for the public; for example, I needed that when I prepared for my IELTS test.

Direct translation extension

Inspired by the pop-up windows of an English learning platform, I want to make a translation extension that shows the translation after the user double-clicks the content he/she selects. Compared to other related products, it shows the translation directly with only one operation; other products need two operations to trigger translation. Now I have finished the front-end code; I need to add the function that requests the translation API. I forgot to push code to GitHub again; I didn't know it is essential in the initial steps of a product.

Personal life

IELTS

The review of my IELTS speaking score has a result: I got the score that I need. This is a great thing because I can put more attention on my personal projects.

Reading

I didn't read as much as last week, because my products didn't go very well.

- The CEO of Telegram was arrested by the French government. Telegram has always been a weird product: it is famous for security, but it doesn't enable e2e by default, and there are so many powerful functions on this platform which are not related to security, like bots and channels.
- The small tools in WhatsApp: there are several very simple products that can bring so much profit. Profits are not always related to complexity.
- Removing backgrounds using WebGPU: I think this kind of product is powerful. It will be great if we can make WebGPU support more LLMs, like the models on Hugging Face.

Exercise

I started doing small exercises this week. I did them at least five times this week; this is a good start.

Next week

Plan for next week:

- Product
  - realize the direct translation extension
  - publish it to the public
  - generate more ideas
- Reading more
- Exercise
- Take part in a social group
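
The ordering problem described under Voice correction (text coming back out of order after multi-threaded voice-to-text) has a simple fix in Python's standard library, sketched below. The `transcribe` function is a hypothetical stand-in for the real speech-to-text call and only sleeps to imitate uneven latency; the relevant point is that `Executor.map` returns results in input order even when later chunks finish first.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def transcribe(indexed_chunk):
    """Hypothetical stand-in for a speech-to-text call with uneven latency."""
    index, audio = indexed_chunk
    time.sleep(0.01 * (3 - index % 3))  # earlier chunks may finish later
    return f"text-{audio}"


def transcribe_in_order(audio_chunks, max_workers=4):
    """Transcribe chunks concurrently while keeping the original order."""
    indexed = list(enumerate(audio_chunks))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of the inputs,
        # regardless of which worker thread finishes first.
        return list(pool.map(transcribe, indexed))
```

For a live stream where chunks arrive over time, the same effect can be had by tagging each chunk with its index and sorting or buffering on that index before display.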

Weekly-#1 First week of indie development

Author: Benson
August 25, 2024 00:00
Background

When I received my IELTS score on 8.19, I still hadn't got the score I need for the speaking part. But the difference between this time and the last several times is that I think I already tried my best in the speaking part. So I don't want to continue preparing for IELTS, because it is expensive and full of random factors.

Meanwhile, my study visa hasn't been approved yet, and I confirmed that I triggered a security investigation, which will take more time than a regular visa application. I will very likely have to delay my master's program if there is no surprise next week. It is weird that IRCC needs so much time to process the visa application; maybe I need other preparations after next week.

As a result, for English: firstly, I have already asked for a score review of the IELTS speaking part; I have already submitted the application to Cambridge English. In addition, I will change my exam from IELTS to Duolingo, whose price is more reasonable and which is relatively easy to pass if I have at least two months to prepare.

More importantly, I am able to start my personal development career right now. This week is the first week in which I develop by myself towards my personal goal, even though I quit my job over three months ago.

Product

Idea source

When I prepared for the IELTS speaking test, I asked an English teacher online for help. I found that the best service the teacher gave me was pointing out the grammar issues in my speaking. Actually, I can produce sufficiently long sentences to answer the questions because I have already practiced these topics several times. My major problem is grammar issues, like past or present tense and singular or plural; I made so many of these kinds of mistakes. So I want to make a product that points out the mistakes made by people who are learning English. I will make it as a Discord bot first. Maybe I will build a web app in the future.

Realize

- receive an audio file or real-time speaking audio
- convert audio to a text message
- correct the errors in the sentences
- display the errors and corrections

Process

Right now, as a Discord bot, it can receive audio messages and output what I want it to display. I am trying to realize real-time speaking input, which is an area I am unfamiliar with; consequently, I need more time to learn some new knowledge.

Obtained new information

- Receiving voice data is sensitive for platforms: there is no official API in discord.py to receive real-time voice data, so I need help from an extension. Luckily, this kind of tool exists.
- Running LLMs purely on the web is possible: it's not suitable for every LLM, but it is a low-cost way to achieve some specific products. There are also great tools that support it, like WebGPU.
- There are plenty of small LLMs, which are friendly to individual developers and small businesses.
- The Discord API is so friendly. My love for Discord is increasing as I get to know this platform more and more.
- Cloud machines are expensive; I would need to spend almost 200 every month if I want to run the product in the future.

Alternative products

Several products are great, but there are always some drawbacks:

- All of them are expensive
- None of them provides the best correction display format
- Some of them cannot keep the correction function running for a long time

Day records

- 8.19: organize my thoughts about this product
- 8.20: explore text-to-speech and text correction models
- 8.21: try a JS-only solution and give up; start to connect the Discord bot; settle on the text correction model and start development
- 8.22: realize the whole process through audio segment input
- 8.23: connect real-time speaking and save the data, but it doesn't match the whole project. Another solution is needed.
- 8.24: rest

Reading

Reading helps me stay connected with the latest technical developments; meanwhile, it provides me with fresh information.

- levelsio: an indie hacker who pursues extreme freedom
  - made numerous products, but only several of them achieved success, which is enough for him to live and pursue his goal without much life-related pressure.
  - most of his successful products are related to AI, which brings lots of opportunities to the internet.
- DeskHub: a hardware product of GitHub bars, so fun
- learning English through printing subtitles with YouTube
- Run Python code in JavaScript: I believe I will use this tool in the future
- Build products on several platforms through the same code
- What product are we building
- PDF to markdown: it uses an LLM, which costs so many resources.
- Discord music bot: for fun
- real-time face swap video: this is useful in xxx fields, weird.

I wrote two solutions for LeetCode; I want to make some videos if I have extra time.

Thought

- I am happy that I have started my product now.
- The passion for a new product will diminish and fluctuate with time; it is strongest in the period before you start to realize it. It is hard and crucial to maintain the balance between passion and clear thought.
- I should push code to GitHub; it feels like taking part in a huge open source project.
- I am too heavy; I need to lose weight.
- I started to write English content like this essay; it has felt natural and comfortable to keep writing since I started.

Plan for next week

- product:
  - realize real-time speaking
  - make a landing page
  - publish it in a Discord channel
- spend some time on exercise
- some time for the Duolingo test
- reading and generating new ideas for the next product
- record videos for LeetCode solutions

2024.8.25

Migrating from Slack to Discord

Author: Benson
July 19, 2024 00:00
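Background

A Slack study-abroad group I joined earlier has a lot of valuable experience and files in its history, but the free version of Slack only shows messages from the last three months.

Recently a member of the community brought this up and suggested that the group owner migrate the history to Discord. The owner had also downloaded the history, so everything was ready.

As it happened, I had just finished my IELTS exam on 7.16, and I could take care of this during the couple of days of waiting for the result.

Migration

I had looked into this before: there is a Slack-to-Discord migration tool that works extremely well, almost seamlessly. What amazed me is that the list of replies under a message is also migrated in the original format, so the migration is almost lossless. With this key point covered, the user experience after migration is largely guaranteed, and the other problems feel less important.

For the concrete steps, just follow the instructions in the original author's GitHub repository. The troublesome part was how to add the bot to the Discord server with a given set of permissions. I searched for a long time, and it turned out that I had to construct a link myself, with the bot's permission information in the link's parameters; visiting the link and clicking confirm adds the bot. This user experience is really a bit awkward. Maybe I just didn't find the right way.

Main problem

During the actual migration there was only one big problem: every time the program had run for 3-5 hours, it disconnected because of the network (or some other reason; it looked like a network problem to me). The program first initializes the Discord client and then sends the messages to the Discord server one by one through the bot. For a network problem, the first option is to retry, but I found that retrying only the small "send message" step was useless: it kept reporting the same error, and I really couldn't find the cause.

So I could only add a retry at the outermost layer, that is, re-initialize the client after every failure. This solves the failure problem. The new work it introduces is a cache (let's call it a cache) that records where the migration had got to at the last failure, so that the next run continues from that position and avoids migrating again.

So the final version of the code adds the outer retry and cache logic to the original, and the migration can be counted as a success. I opened a PR with the final code, but the added feature feels rather crude, so it probably won't be merged; leaving it there as a heads-up for those who come later is fine too.

Result

The supported features are as follows:

- Comments: the comments under each message are migrated in the original format, seamlessly. Great.
- Content: besides text and links, images and attachments display normally; files that are too large cannot be migrated.
- Users: avatars and usernames are migrated. Impressive.
- Time: on the new platform, the send time of the content is the migration time, not the actual send time of the original message. As an alternative, each message on the new platform is prefixed with the real send time; this is a feature of the original tool.
- Attachments: a message with attachments is sent as two messages on the new platform; the first is the text apart from the attachment, the second is the attachment.

Detailed log

The whole process took 2 days in total, and the program ran for about 24 hours. The detailed debugging process:

- 7.17 14:00: failed after running for one hour; unstable network; switched to a Hong Kong server and reran
- 7.17 21:00: failed after 3.5 hours; unstable network; added 10 retries to the await, 5 minutes each
- 7.18 07:40: failed after 4 hours, having migrated ten thousand messages; run time and number migrated were about the same as last time. The problem was that the retry added last time did not cover enough; widened its scope and reran at 9:45; keeping an eye on memory and CPU
- 7.18 12:32: error at 2024-03-18 in the college application group, a different position from the last error, so not a content problem
  - changed it to skip after erroring for more than 10 minutes and reran, starting at 12:32
  - Conclusion: 1) memory is stable, not a memory problem 2) the error position differs from last time, not a content problem
- 7.18 15:13: error at 2022-11-01 in the community main channel; caused by a logic error in the exception code; fixed and reran at 15:36
  - Error message: looks like an unstable network
  - Conclusion: rerunning succeeds, not a content problem
- 7.18 20:24: it dropped at 18:47 after running for three hours, cause unknown
- 7.19 8:38: added the outer retry and cache, ran from the main channel; finished after 10 hours
  - 72,943 messages in total. After the community main channel, 134 messages failed to migrate; an estimated 300 or so in total, less than 0.5%

Acknowledgements

- Thanks to the original author for the code
- Thanks to the group owner for the opportunity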
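
The outer retry plus "cache" described under Main problem can be sketched as follows. This is not the code from the PR: `make_client`, its `send` method, and the JSON checkpoint file are hypothetical names chosen for illustration. The idea is that any failure throws away the client, builds a fresh one, and resumes from the index stored in the checkpoint.

```python
import json
import os


def migrate(messages, make_client, checkpoint_path, max_restarts=10):
    """Send messages in order; on failure rebuild the client and resume from the checkpoint."""
    start = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            start = json.load(f)["next_index"]
    restarts = 0
    while start < len(messages):
        client = make_client()  # fresh client on every (re)start, not just a retried send
        try:
            for i in range(start, len(messages)):
                client.send(messages[i])
                start = i + 1
                # Persist progress so that a crash or restart does not migrate twice.
                with open(checkpoint_path, "w") as f:
                    json.dump({"next_index": start}, f)
        except Exception:
            restarts += 1
            if restarts > max_restarts:
                raise
    return restarts
```

Writing the checkpoint after every message is slow but simple; if the process dies between a successful send and the checkpoint write, that one message would be sent twice on resume.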

IELTS Preparation 2024Q3

Author: Benson
June 30, 2024 00:00
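Method

After a month of half-hearted preparation, I scored 6.0 on June 18 (L 5.5, S 5.5, R 6.5, W 5.5). I admittedly had not put in enough effort, but I also felt I had not found the right method, so I reworked my study plan.

Listening: On a friend's advice, I started intensive listening. After listening to a recording, I write it out from memory, compare it with the transcript, and review the parts that do not match. Then I read it aloud and make sure my pronunciation matches the original audio.
Speaking: I have not practiced enough. Judging from my last two exams, practicing about 60 Part 2 topics would probably cover the recent questions. It is also plain practice for my speaking ability.
Reading: My score is already good enough, so I will not review reading specifically for now.
Writing: The course by the "band 9 senior" (9分学长) was quite useful and I have finished watching it. Now I need to practice more and refine each essay against model answers.

Log

6.28
Listening: intensive listening, one S2
Speaking: practiced one P2
Writing: none

6.29
Listening: intensive listening, 0.5 S2
Speaking: none
Writing: practiced one T2

6.30
Listening: intensive listening, 0.5 S2
Speaking: 6 P2
Writing: 1 Task 2 essay

7.1
Listening: intensive listening, 1 S2
Speaking: 14 P2
Writing: one Task 2 essay and one Task 1 essay

7.2
Listening: one listening test, 6.0, weak on fill-in-the-blank questions; practiced Wang Lu's listening vocabulary, 1/3 done, about 740 words
Speaking: practiced 14 P2
Writing: corpus practice, 4/12

7.3
Listening: one listening test, 5.0, fill-in-the-blank questions especially bad; Wang Lu's listening vocabulary, 2/3 done
Speaking: 14 P2
Writing: corpus practice, 8/12; one Task 1 essay

7.4
Listening: one listening test, 7.0, weak on multiple-choice questions; Wang Lu's listening vocabulary, 3/3 done
Speaking: 2 P2
Writing: revised the Task 1 essay up to 6.5 and summarized common expressions; corpus practice, 12/12

7.5
Listening: one test, 6.5; reviewed Wang Lu's listening vocabulary, 280 unfamiliar words
Speaking: 12 P2
Writing: reviewed the culture topic and practiced two standalone supporting arguments, but there was no way to get them graded, which is inconvenient; also, only about 35% of the corpus material turned out to be usable; one Task 1 essay, not graded

7.6
Took a day off and went cycling.

7.7
Little practice.
Listening: one test
Speaking: 1 P2
Writing: one Task 1 essay

7.8
Listening: three sets of recalled exam questions (jijing, 机经), average 6.5
Speaking: none
Writing: one Task 2 essay and one Task 1 essay; 6.0

7.9
Listening: 4 jijing sets, average 5.5-6.0; I am lost again
Speaking: practiced 10 P2, finishing the first round; the second round starts tomorrow
Writing: two Task 1 essays; 5.0-5.5. ChatGPT's score estimates do not seem very accurate, sigh

7.10
Listening: one intensive-listening set; three jijing sets
Writing: two Task 2 essays and one Task 1 essay; with enough time I can reach 6.0

7.11
Listening: 5 jijing sets, average 5.5-6.0
Writing: two Task 2 essays; still writing slowly
Speaking: 1 P2

7.12
Listening: 4 jijing sets, average 5.75
Writing: two Task 2 essays; 6.0 after revision
Speaking: practiced 7/50

7.13
Listening: 4 jijing sets, all done, average 5.75
Writing: 1 Task 2 essay, 6.0
Speaking: second pass over the high-frequency topics, 25/50

7.14
Listening: reviewed 5 jijing sets
Writing: one Task 2 essay and one Task 1 essay
Speaking: second pass over the high-frequency topics, 50/50

7.15
Listening: reviewed all the jijing sets once
Writing: none
Speaking: practiced the 50 high-frequency P2 topics once more

7.16 Exam
Speaking: The P2 topic was indeed one of the 50 high-frequency topics, but P3 threw me. I answered almost every question in only two or three sentences and did not know how to expand. I said "sorry" three times; when a question is long, it goes in one ear and out the other and I cannot catch all of it. It felt somewhat better than last time, so 6.0 seems about right.
Listening: The fill-in-the-blank questions went fairly well. The multiple-choice questions were a mess: I could eliminate some options but mostly did not hear the answers, so it is down to luck.
Reading: I was confident about the first two passages but could barely understand the third, which also had fill-in-the-blank questions with three words per blank, so it was quite hard. On par with previous attempts; 6.5 should be no problem.
Writing: The Task 2 essay took a long 47 minutes. I reviewed it once after finishing, and it was better than my previous attempts. I had only reviewed half of the Task 1 essay when time ran out. Overall it felt a bit better than last time; whether the score goes up remains to be seen.
There are too many risk points; it feels hard to pass outright.
After the exam I felt drained and dizzy. The pressure of the recent preparation has been too much and I need a rest. Everything can wait until the results come out on Friday.

7.19 Results: 6.0
Listening: 6.0, not bad, but I am not confident; I need to keep reviewing
Writing: 6.0; ChatGPT is simply a strict grader; holding this level is enough
Speaking: 5.5, the worst. It seems P3 carries too much weight; this is the focus of the next round of review
Reading: 6.5, no improvement; aiming for 7.0 next time

7.20 Decided to retake the test on 7.31
Speaking: the main focus; signed up for one-on-one practice, with lots of P3 practice
Listening: take one class, continue intensive listening, try to raise the score
Reading: learn techniques, practice, try to raise the score
Writing: practice, speed up, hold 6.0

7.21
Speaking: settled on a course and bought 10 one-on-one speaking lessons; started P3 practice, 6/18 pages
Listening: took one class, shadowed 0.5 passages; read aloud Wang Lu's times, dates, and so on

7.22
Speaking: one lesson, review, P3 practice 10.5/18
Listening: close study of one S2 and one S3
Writing: P1 practice; ChatGPT 4o mini surprisingly gave it 6.5

7.23
Speaking: one lesson, P3 practice 15/18
Listening: close study of one S2 and one S3
Reading: practiced and graded one set, 6.0, terrible

7.24
Speaking: practiced P1 in class; P3 practice 18/18 pages; added 6 more P3 topics
Reading: one set, 6.5 (27 questions); I need to manage my time

7.25
Speaking: practiced 10 P3 topics
Reading: one set, 15 minutes over time, 6.5; I have not mastered the techniques well enough
Listening: one set, 5.5; the fill-in-the-blank questions went too badly, so I need to memorize vocabulary again

7.26
Speaking: practiced 5 P3 topics, one speaking lesson
Reading: one set, 6.5
Listening: one set, 5.0; my level is erratic

7.27
Speaking: practiced 10 P3, one speaking lesson, mock exam 6.0, low confidence
Reading: one set I had done before, 7.0
Listening: practice plus intensive listening, one set

7.28 Registered for a retake on 8.1
Speaking: practiced 10 P3, one speaking lesson
Writing: one Task 2 essay, 6.0

7.29
Speaking: practiced 5 P3, one speaking lesson
Reading: one set, 6.5, one question short of 7.0, a pity
Listening: one set plus extensive listening, 6.0, one question short of 6.5, a pity

7.30
Speaking: mock exam, 5.5; practiced 5 P3
Reading: one set, 6.5
Listening: one set plus extensive listening, 6.5

7.31
Speaking: one speaking lesson; practiced the remaining P3 topics, 54 P3 topics reviewed in total; ran through the one-minute notes for all 54 P3 topics; reviewed the harder P2 topics
Reading: one set, 6.5
Listening: one set plus extensive listening, 6.0

8.1 Exam
Speaking: P2 hit the topic "a person who buys cheap things", but I still mix up "he" and "she". I answered the first P3 question reasonably well, but could not understand what the later questions were asking and spent the remaining 4 minutes rambling, a complete mess. 5.5 is most likely; 5.0 is also possible.
Listening: I missed the last segment of a phone number and spelled "perfume" without the final e; normal performance. The second section went very well, the third extremely badly, and the fourth was normal. 6.0 is most likely.
Reading: In the first two passages I caught basically all the synonym substitutions. Three blanks in the third passage stumped me, and the rest was average. 7.0 feels likely.
Writing: I wrote the Task 1 essay first and spent 27 minutes on it, which was a strategic mistake. For the Task 2 essay I could not think of synonyms for "production" and "consumption"; I finished and had reviewed half of it when time ran out. Normal performance; it feels like 6.0.
Conclusion: once the results are out, decide whether to retake the whole test or only the speaking section.

8.2
Advice I collected:
Listen to more questions
Read more questions and answers, which helps with understanding the questions
"How" questions can be answered from several angles; read more answers to collect ideas
Practice more
Speaking: collected Part 3 questions, generated audio for them, and listened to 500 questions
Other: built a small number-assignment (分号) tool for my partner

8.3
Speaking: practiced answers for 3 P3 topics; collected another 500 P3 questions from the current season, harder than yesterday's; listened through them once and reviewed the unfamiliar ones

8.4
Results are out. Compared with last time, listening dropped by 0.5. I am baffled... I bought a Duolingo course and am thinking about taking the Duolingo test instead.

8.5
Duolingo is too hard, so back to IELTS. Don't panic; review first. In the worst case I defer. I must not put too much pressure on myself.
Review method:
Speaking: review listening to make sure I can understand the questions; practice speaking
Listening: close study + memorize numbers + memorize vocabulary + practice + summarize
Reading: try to practice one passage every day
Writing: no practice, just hold steady
Practice
Speaking: practiced 3 P2
Listening: reviewed 3 sets and summarized; generated sentences and audio from the listening vocabulary and listened several times

8.6
Speaking: practiced 5 P3 topics
Listening: read through the numbers once; went through the vocabulary once, 60 fewer unfamiliar words; practiced 3 sets plus 2 rounds of intensive listening
Reading: the time split for the three passages should be 12/18/25 minutes; practiced passage 3, 12 correct in 25 minutes

8.7
Speaking: practiced 5 P3 topics, 2 hours
Listening: practiced 3 sets plus 1 round of intensive listening, and it feels like I am improving; read through the numbers once; vocabulary, 15 fewer unfamiliar words
Reading: practiced two passages; together with the one done earlier, the times were 15/21/25 minutes, 7.0, barely on target

8.8-8.11
Went home and rested for 4 days, averaging half an hour of listening practice per day.

8.12
Speaking: practiced for 2 hours
Listening: three sets plus one round of intensive listening, two 6.0 and one 6.5; read through the numbers once
Reading: one set with the new strategy, 7.0

8.13
Speaking: practiced for two hours
Listening: two sets, both 6.5; read through the numbers once
Reading: two sets, one 7.0 and one 5.5

8.14
Speaking: practiced 8 P2 and about 5 P3; one mock test, 6.0
Listening: one set, 6.0
Reading: one set, 7.0

8.15
Speaking: practiced nearly 30 P3 topics
Listening: none
Reading: none

8.16 Exam
Speaking: I talked a lot and the examiner did not interrupt. P3 seemed to end after only three questions, which suggests my answers really were long, although I did not use much advanced vocabulary. Most likely 6.0 this time.
Listening: The four multiple-choice questions in the first section, the six fill-in-the-blank questions in the second, and the five multiple-choice questions in the third felt good; the other questions were so-so. It feels like a coin toss between 6.0 and 6.5.
Reading: The first passage took 16 minutes, which is too long. The difficulty gap between the three passages was smaller than last time. I did all right, but none of the passages went especially well; most likely 6.5.
Writing: I had not written for a long time and it showed. I wrote the Task 2 essay first, which took 45 minutes and left no time to review; the Task 1 essay was rather poor. The Task 2 essay felt acceptable, so I should still hold 6.0.

8.19 Results
Speaking: still the eternal 5.5. I cannot believe it; I plan to apply for a re-mark in a few days
Listening: 7.5; the questions suited me and luck helped too, incredible
Reading: 6.5, as expected
Writing: 6.0, as expected

Date         Listening  Reading  Writing  Speaking              Overall
2024.8.16    7.5        6.5      6.0      5.5 (6.0 on re-mark)  6.5
2024.8.1     5.5        6.5      6.0      5.5                   6.0
2024.7.16    6.0        6.5      6.0      5.5                   6.0
2024.6.18    5.5        6.5      5.5      5.5                   6.0
2024.1.21    6.0        6.5      5.5      5.0                   6.0
2023.12.24   5.5        5.5      5.5      5.0                   5.5

8.23 Applied for a re-mark
Why I applied:
This time I spoke far better than in the previous four attempts: I said more, paused less, and used better vocabulary
Mock assessments beforehand gave 6.0 on all four scoring criteria
This is already my best form
If the re-mark fails, the possible reasons are:
My P2 answer was monotonous in form, always using "I" as the subject, though I do not think this matters much
My vocabulary was not advanced enough; I had prepared advanced words, but whether they fit depends on the topic, and most likely they could not be used
Grammar mistakes, an old problem that is hard to fix in a short time
Improving further would be quite hard; whatever the result, I am switching to Duolingo.

8.30
Received the re-mark result: speaking 5.5 -> 6.0. The feeling of effort paying off is fantastic.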
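The overall scores in the table are consistent with the usual IELTS rule: the overall band is the mean of the four section scores, rounded to the nearest half band, with averages ending in .25 or .75 rounded up. A minimal Python sketch under that assumption; the function name `overall_band` is only for illustration.

```python
import math


def overall_band(listening, reading, writing, speaking):
    """Mean of the four section bands, rounded to the nearest 0.5 (ties round up)."""
    mean = (listening + reading + writing + speaking) / 4
    # Doubling maps half bands to integers; adding 0.5 before floor rounds ties up.
    return math.floor(mean * 2 + 0.5) / 2


# Rows from the score table: (listening, reading, writing, speaking)
attempts = {
    "2024.8.16": (7.5, 6.5, 6.0, 5.5),
    "2024.8.1": (5.5, 6.5, 6.0, 5.5),
    "2023.12.24": (5.5, 5.5, 5.5, 5.0),
}
for date, scores in attempts.items():
    print(date, overall_band(*scores))
```

For the 2024.8.16 sitting the mean is 6.375, which rounds to 6.5 even before the speaking re-mark, so the re-mark changed the speaking band but not the overall band.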

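A Collection of Chinese Blogs

Author: Benson
2024-06-29 00:00
Background
V2EX (V站) released a new feature, VNXA. Users have repeatedly asked the site owner to add a comment feature, but nothing seems to have come of it. I think commenting is a very good idea.
After browsing VNXA for a few days, I found that it indexes relatively few blogs: only about 10 new posts a day, though the number is slowly growing.
So I wanted to build such a system myself. Comments need a backend, which takes time, so I scaled the idea down and first built something similar to VNXA, deployed on GitHub Pages so that I do not have to run my own server. It only needs to support my own idle blog browsing.
Kudos to GitHub Pages; it really is great to use.

Getting started
Blogs are found mainly in two ways:
Collections of independent blogs
Friend-link expansion (following each blog's links to other blogs)
Once a blog is found, its posts are discovered through its RSS feed, so the system feeds itself and keeps adding new blogs.
Result: zhblog, currently about 8 posts a day; the volume is still small.
Following the MVP-first principle, the basic features are all implemented. Some problems remain, and I will decide later whether to fix them:
The friend-link expansion logic is crude. It has discovered 23,000 friend links so far, but a 3-hour run can only request 2,000 of them; to request all 23,000 I would need to add caching or multithreading.
RSS requests plus parsing fail 60% of the time; this probably needs case-by-case optimization.
The comment feature requires setting up a server.

The funny part
Comment system aside, after finishing I found that similar products already exist. The best is BlogFinder. It appears to index blogs manually, so its quality is much higher than mine. Very nice.

Thoughts
I am still preparing for the IELTS exam, but I could not resist the urge and spent two full days on this. If I had enough time, I would be very happy to build products like this.
Apart from IELTS preparation, this is the only finished project since I left my job, and it brought back a long-lost sense of satisfaction. Getting hands-on really is what matters most!
I will keep going after the IELTS exam.

Appendix
Result: https://informal.top/zhblog
Code: https://github.com/wa008/zhblog
BlogFinder: https://bf.zzxworld.com
Similar product, Jixin (积薪): https://firewood.news
Chinese independent blogs: https://github.com/timqian/chinese-independent-blogs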
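The post notes that requesting all 23,000 friend links would need caching or multithreading. Below is a minimal Python sketch of that idea, not the code from the zhblog repository: a thread pool fetches URLs concurrently and a dictionary cache skips URLs that were already fetched. The names `crawl` and `fetch_url`, the worker count, and the timeout are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from urllib.request import urlopen


def fetch_url(url, timeout=10):
    """Download one page; network errors propagate to the caller."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read()


def crawl(urls, fetch=fetch_url, cache=None, workers=16):
    """Fetch every URL not already in `cache`, using a thread pool.

    Returns (cache, failed): cache maps url -> body, failed maps url -> error text.
    """
    cache = {} if cache is None else cache
    failed = {}
    # Drop duplicates (keeping order) and anything fetched on a previous run.
    todo = [u for u in dict.fromkeys(urls) if u not in cache]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(fetch, u): u for u in todo}
        for fut in as_completed(futures):
            url = futures[fut]
            try:
                cache[url] = fut.result()
            except Exception as exc:  # one unreachable blog must not stop the crawl
                failed[url] = repr(exc)
    return cache, failed
```

Passing the cache from a previous run back in means a later run only requests links it has not seen. Because fetching is mostly waiting on the network, threads are a reasonable first choice here; persisting the cache to disk between runs is left out of the sketch.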

English Diary in May

Author: Benson
2024-05-08 00:00
I have started learning English with a plan. I don't have a writing task every day, so I think this place will become my English diary, to make me write something in English every day.

5.8
The lease on the house I rent now expires in June, so I looked for the next apartment for my girlfriend and me; of course, my girlfriend is the one who will mainly live in the new apartment. Picking a suitable apartment is a complex process: we looked at six apartments, but none of them suited us. In the afternoon, studying was hard; I couldn't help playing with my phone while I was studying. Anyway, I still finished my reading task today. In the evening, I rode my electric bike to take my girlfriend home. The temperature made the weather great, and I was also full of satisfaction because of the task I had finished in the afternoon. That moment made the hard afternoon worthwhile. It will be better tomorrow. Good night.

5.11
The day before yesterday, I finished an IELTS reading task, and afterwards I read the translation carefully. The passage tells a story about art; specifically, it talks about painting. Most of us go to a museum to enjoy famous paintings when we want to see some. But for novels, we don't usually look at the original handwritten version; instead, we read a copy. Nowadays, copying technology is good enough to reproduce a painting with all its characteristics. The passage argues that the owners of original paintings are one of the reasons people are encouraged to enjoy originals instead of copies, because that keeps the value of the original; not just the owners of the paintings, but museum owners and others connected to the originals may all think this way. I don't know whether this is true or not. What impressed me deeply is how much critical thinking there is in the English essays I read. Critical thinking is helpful for seeking facts and essential for society.

5.12
Finally, my body has fully recovered from the trip to Huangshan.
I started running again today, even though it didn't last long.

5.14
Today I completed one cycle of my study plan, and then I reviewed what I had learned in the last seven days. I found that I hadn't learned it well, and I didn't have enough patience when I did some of the tasks. This plan is not good enough, or at least it is not suitable for me. Tomorrow I will change my plan for preparing for the IELTS test; I hope the next plan is better.

5.17
Since quitting my job, I haven't had much pressure in daily life, and I always have intense, very long dreams when I sleep at night. I feel so tired when I wake up every morning, and it has a bad effect on my whole day. I searched for solutions to this problem and finally found an effective one: I read a book rather than playing with my phone for half an hour before I go to sleep. That leaves me with only a few dreams; what's more, the dreams I have are not so intense, which makes me feel better the whole day.

The original plan was to finish a full set of IELTS tests every day; of course, I can't actually do that much. Every time I finish one of the four parts, I give myself a break. That's why I am writing these words. Writing is better than watching short videos, at least.

5.19
I was burned by boiling water. It hurt so much at that moment because you can't run away from that pain. We went to the hospital immediately and the doctor gave me some treatment. Today it is better, but there are some blisters on my right arm, and I must change the dressing on it a few times in the next few days. What a special experience.

5.20
One of the most important listening skills in English is understanding the meaning of a sentence without needing to go over it again. This is also a key characteristic of spoken language: you can only listen to it once. I can still practice this skill in reading: I can remind myself to read each sentence once and not go back to read it again. This is also useful for reading articles more quickly.
Learning from IELTS reading passages really expands my horizons. In most cases, only male deer have antlers and females don't; that is part of the reason I gave a wrong answer to one question.

5.23
Stopped updating Top Hacker News from today.

5.24
I have been living with plenty of free time for almost a month, and I still spend a lot of time on social media, which does not bring me satisfaction. I think I was already tired of reading news on social media. Since yesterday, I have kept myself away from the news on social media, and I have felt better over the last two days. Keep it up.

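May Day Trip Notes

Author: Benson
2024-05-06 00:00
Departure
We missed the high-speed train, and 1,000 yuan bought me some new knowledge: a scalper can take you straight into the station, and once the train has left you simply buy a ticket on board. Yes, you can buy a ticket on board without having one; the conductor is used to it. If you are about to miss a train and are in a hurry, you can also find a scalper.
Where there is demand there is business; even under the high-speed rail rules there is an underground supply chain.

Wuhan
The air in Wuhan is very humid. It is miserable when it is hot, but wonderfully comfortable when the temperature is right.
The Wuhan metro has fewer ads than Beijing's, so there is room for further commercialization; if advertisers have the budget, it could follow Beijing and add ad slots.
Passengers on the metro smile far more than in Beijing. The happiness index seems much higher than Beijing's, though that may be related to the May Day holiday.
Our hotel was 18 km from Wuhan Railway Station, yet there were plenty of residential buildings and restaurants nearby, and quite a few villas. Wuhan feels evenly developed, unlike Beijing, which has grown outward almost strictly ring by ring.
Wuhan's metro has already built up to Line 18, while the highest line number open in Beijing is 19; impressive development.
In the evening we ate Wuhan fish. There were many kinds and I can no longer remember which, but they tasted great, as you would expect from a city so rich in water.
East Lake is really big, and the greenway is a blessing for cyclists; with an environment like this, I imagine most children nearby would fall in love with cycling. Because East Lake is so big, there were many May Day tourists, and the cable car queue was too long, we did not climb the hill or visit any specific scenic spot; we only cycled a lap along the lake, but the scenery along the way was already very nice.
We looked at the Yellow Crane Tower from afar and could not see anything remarkable about it. Influenced by the film Chang'an (长安三万里), I find Cui Hao's poem "Yellow Crane Tower" really good. I like a line from Gao Shi even more: "As long as the poems about the Yellow Crane Tower exist, the Yellow Crane Tower exists." Nothing material lasts forever; only art (poetry) can.
The Wuhan Yangtze River Bridge is 1,670 meters long. The one we visited may not have been that particular bridge, but it looked spectacular too. Building a bridge across a river a kilometer wide on only 8 piers is truly astonishing engineering.
In the evening we visited a friend's home. Life after buying a home really does seem a good deal happier.
At school we were all the same kind of student. Five years on, everyone's life and plans are different, which is fascinating; I wonder what things will look like in another five years.

Huangshan
The driver who took us to the guesthouse was fun and loved to joke. Humorous people carry a charm that makes you like them without noticing. They must also love life, because only people who love life have the energy for humor.
The commercial street outside the Huangshan scenic area is long, so prices are reasonable. The stinky mandarin fish (臭鳜鱼) tasted good and the hairy tofu (毛豆腐) was mediocre. Bamboo shoots are a specialty too, but the restaurant owner said they were all bitter bamboo shoots, so we did not try them. In the evening we had the local topping noodles (浇头面); the classic topping has bamboo shoots, shiitake, oyster mushrooms, dried tofu, and many other ingredients, and it tasted delicious. A dish with this kind of flavor is like a low-key rich kid. The baked flatbread (烧饼) was good too, and we took some with us when we left. Every so often along the street there was a Lanzhou noodle shop; it surprised me that there are so many even in a place like this. People from Lanzhou really know how to do business.
The next day we climbed the mountain. Getting up at 5:30 did not even put us in the earliest group; the competition is fierce. Because of the rain, we could see no scenery from the cable car on the way up, just fog all around. One stretch of the way up was quite congested, since everyone was heading up at the same time.
There really are a lot of pine trees on Huangshan. Many grow on cliffs and endure wind and rain; no wonder the ancients called them "sturdy pines" (劲松). To my surprise we also saw two squirrels, and I could not help feeling there must be some connection between squirrels (松鼠) and pine trees (松树).
Most of the time the mountain was shrouded in fog, with very low visibility and no scenery to speak of. For about two minutes, whether because a little sun came out or for some other reason, much of the fog blew away and revealed the bare rocky peaks. The view was great and I got the best shot of the day, hahaha.
Besides the normal ascent and descent, we also walked down to the bottom of the valley to see the grand canyon and then climbed back up, so by the time we got back in the evening my legs and feet were exhausted.
The following day my whole lower body ached; it had been a long time since I had exercised this much.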

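Starting My Leave

Author: Benson
2024-04-24 00:00
Wrapping up work
I have been on leave since Monday. Today is Wednesday, already the third day. Without work, each day feels like it has plenty of time for many things.
Early on Monday morning, I organized my work from the past two years and wrote a summary, which I may need when preparing a resume for job hunting. Judging by the summary alone, I have worked on a lot of business lines. Looking closer at the content, though, it lacks systematic planning and was more a matter of going wherever things led. In essence I lacked a deep understanding of any one business, and so I lacked a sense of control.
Looking beyond the company, working at a big tech company really is like being a cog. In five years at big tech I have done many projects across many businesses and learned about the upstream and downstream systems and the corresponding tech stacks, but only at the level of familiarity; if I had to do it hands-on, I would not know where to start.
The good part is that I have seen the tech stacks of many different businesses, such as online services, streaming systems, and offline frameworks, as well as the alerting, incident handling, and on-call modules. If I needed to build a product or own a business now, I would at least have a framework in mind and know how to go about it. That is what I gained from these five years.

Life
Without work there is plenty of time for all kinds of things, such as cycling, running, and meals with friends, who are of course also colleagues who have quit, hahaha. There is also more time to buy groceries and cook; today I even made lunch and took it to my partner.
Life without work seems to have returned to a more basic stage: away from restaurants, takeout, and shuttle buses, and back to groceries, cooking, and exercise, as if I were adapting in advance to life after moving to Canada.

English
The most important thing in the next two months is learning English. I must reach 6.5 by the end of June; otherwise I will probably have to defer my enrollment.
Learning a language is unlike learning other knowledge, because it is not fixed content. The study has to stay flexible while actually improving language ability (exam scores).
Today I started reading a book in English that I had read in paper form before, Show Your Work. The first few chapters say that whatever you did each day, you should pick something to summarize and share. Share your results every day, and when you look for a job later, that is the best proof of your ability. Reading that is what made me open my computer and write this blog post, hahaha.
An aside: while writing this I remembered something. Recently a Reddit user messaged me privately to ask whether I wanted to rent a place near OTU. He presumably found me through my posts. He said they have a house with 3 bedrooms where only he and his wife live, and asked whether I wanted to rent one of the rooms. He also pointed out that there is a laundry room on the first floor; I do not know what the point of that reminder was, but it shows he is a nice person. Judging from his post history, he is a longtime internet user and seems nice. For no particular reason, I can sense his goodwill toward me (the house is quite good, so he could easily find tenants on other platforms and had no need to go to the trouble of searching himself). I guessed that he had probably browsed my post history and seen that I am a nice person too (I went back to check: he had in fact replied to a post of mine on the school forum), which is why he messaged me.
The room is 800 CAD and 15 minutes from school by bike. If he is not a scammer, I could rent it. Meeting such a nice person before I have even arrived, plus the fact that a colleague of mine studying in Toronto and the colleague he introduced are both very nice, makes me feel that everyone over there is nice. I am looking forward to it!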

Learning English in April

Author: Benson
2024-04-24 00:00
I have started writing on my blog again.

4.24
Today's topic is cities. The city is a common topic in IELTS (International English Language Testing System), and many other topics are still related to cities.

First, the transportation system. It is one of the most important parts of a city; it decides how conveniently and quickly people can move from one place to another. For company employees, convenient transportation gives them more choices when they are seeking a job or considering changing jobs. Take Beijing, which is almost the biggest city in China: people living in Beijing can work at any company as long as it is located in Beijing. The apartment of one of my former colleagues is 33 km away from our company, and the commute only takes one and a half hours because Beijing has the most convenient subway system in the world.

Second, prices. In recent years, it has become increasingly expensive to do or buy anything, including necessities; people must pay more to buy things than in the past. So people must earn more money to cover their expenses. This situation is more serious in big cities than in small cities and towns.

Third, job opportunities. There are more and better jobs in big cities, which is why more people leave their hometowns and go to big cities. The population of big cities has been going up in recent years. It is not convenient for people who settle in another city, rather than their hometown, to take care of their parents, especially as their parents get older. It is a serious problem that the government should pay attention to.

4.25
Parks are today's topic. A park is a great place for people who live in the city to walk and enjoy natural views. To me, the Summer Palace is the most famous park in Beijing. It was built more than two hundred years ago.
Historically, emperors usually spent the summer at the Summer Palace because it is a special place that offers a cool environment in summer. That's why it is called the Summer Palace. There is a big lake called Kunming Lake in the center of the park, which occupies seventy-five percent of the whole area of the Summer Palace; this is one of the main reasons it is relatively cool in summer. There is a small hill in the north of the park, and many temples are built on it; visitors often pray for good luck in these temples. The rest of the Summer Palace is full of trees and grassland, which keeps the air cool in summer. Lots of people visit the Summer Palace every day: in summer they can enjoy the cool environment, and in winter they can enjoy the beautiful lake view from the hill. This great park attracts not only local people but also foreigners; you can often see foreigners when you are walking in the Summer Palace.

Chaoyang Park is the biggest park in the east of Beijing. The day before yesterday, my colleagues asked me to go to the biggest book fair in Beijing, which is held in Chaoyang Park and lasts for eleven days. There were lots of book stalls; I had never seen such a large place that just sells books. The Beijing book fair also has a very long history, and twenty years ago its scale was ten times what it is now, according to one of my colleagues who has lived in Beijing for many years.

"Information cocoon" has been a popular term in recent years. It means people always stay within a small, stable set of information and can't reach other information. A book fair is a great place to break the information cocoon: there are all kinds of different books, and everyone has a chance to come across books that are inaccessible in their daily life.
But with the development of computers, such as personal computers and smartphones, people are paying more and more attention to social media like TikTok. They don't have much time to wander around book fairs and read books.

My original idea was to write about natural scenery for the park topic, but I wrote many of my thoughts about society and life instead. Writing is such a great way to help people think deeply, and it is not hard to do. I should keep it up. I can talk more about natural scenery next time. It is now half past ten; time to make lunch.

4.27
Writing is a good habit. It would be great if I could write something about my day every night before going to sleep. As I read on Twitter, there are no smart people in the world, just people who have good habits.

Yesterday I received the bone conduction headphones I bought on Xianyu, a platform for buying and selling secondhand things their owners don't need. They not only play songs stored on the device itself, but can also connect to a phone over Bluetooth as ordinary headphones. I can hear sounds from my surroundings while I'm listening to the headphones. They are a great tool for sports like running, swimming, and cycling: I can enjoy music, and when something bad happens and makes a sound, I can hear it and respond, which makes me safer.

In the afternoon, I went to the company to solve the last question from the colleague who took over my previous work. Just like before, there were so many little things they had to deal with. I had the old, familiar feelings; it seemed as if I was back at work again.

In the evening, I had dinner with some friends who were my colleagues two years ago. Everyone has a different path into the future. Talking always makes people happy. One of my colleagues gave me some advice about my path to Canada. There is so much uncertainty about my future, and I have concerns too. Anyway, I think I should give it a try.
The only requirement from my university about my IELTS is the overall score; they don't have specific requirements for the individual scores, like speaking, listening, reading, and writing. So I have started doing past IELTS papers, the real questions from previous exams, again.

In the evening, I went running as I did the day before yesterday. There were fewer runners than last time, maybe because today is Saturday.

4.28
IELTS writing test

It is important for children to understand the difference between right and wrong at an early age. Punishment is necessary to help them learn this distinction.

I totally agree that it is necessary to punish children to help them distinguish right from wrong when they are young.

Firstly, teachers and parents should give children small punishments to make them understand that they have done something wrong. For example, if they waste food when eating or make the desk or floor dirty on purpose, teachers and parents can stop them and tell them that it is wrong. Food is a precious resource, and it is wrong to waste it. A dirty floor makes other people more likely to get sick and creates more work for cleaners. Teachers and parents can have children read relevant books and explain what is right; furthermore, they can assign small tasks related to food and cleaning, which can help children understand the value of food and cleanliness.

Secondly, the purpose of punishment should be to help children understand what is right. At the same time, the punishment should be achievable for children. Teachers and parents can punish children if the punishment meets these two conditions.

Finally, if teachers and parents don't punish children, it is hard to make them understand and remember that what they are doing is wrong. Effective and appropriate punishment is a suitable way to make children understand wrongdoing and help them avoid it when they face a similar situation again.
In summary, appropriate punishment is absolutely necessary to help children distinguish right from wrong; teachers and parents can use it to educate children and make them better.

Today's topic was hard; I don't think I would get a high score in the real test. The model answer takes a different position from mine, so it is not much use for checking my answer.

Today
Today I spent most of my time on Reddit and Discord. I want to improve my English communication skills by chatting with English speakers. After trying for about four hours, I found that I don't have much content or many questions to chat about or share; at the same time, it is not a very efficient way to learn English compared with reading English articles and writing English blog posts or IELTS essays. For chatting with English speakers, the more important point is to find the topics I like most. If I discuss a topic that I'm skilled at and interested in, the discussion can last longer and go deeper. Coding is obviously a good thing to do. Codeforces, Advent of Code, and llm.c are good options for me. I should pick one and start tomorrow.

4.29
I started solving problems on Advent of Code. 2023 Day 25 is the last puzzle on the site and I tried it first; I never thought it would be so hard for me. It is a graph problem. I spent the entire afternoon and couldn't find a way to solve it. It is a familiar feeling, reminding me of the time I prepared for the ICPC (ACM) contest in 2018. Of course, I didn't study any English for IELTS today. Bad!

4.30
The Advent of Code problem I picked is hard for me. It needs an algorithm called Karger's algorithm. I tried to learn it and write code to implement it. During one test, I was testing the code with vim and deleted code that I had spent two hours writing, and it couldn't be recovered. I hated vim so much at that moment. Afterwards, I still had not found a good way to recover code after closing vim.
But what I learned is that I should return to the console from vim by pressing Ctrl+Z and then type fg to return to editing in vim; my changes can then still be recovered. This is not a perfect solution, but it can solve the problem some of the time. The real solution is to be careful when I change code.
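For reference, here is a minimal Python sketch of Karger's contraction algorithm, the algorithm named in the last entry above. It is not the diary's lost code. It uses the common formulation in which the edges are visited in random order and each edge that joins two different groups is contracted until two groups remain; the function name, trial count, and seed are assumptions, and the graph is assumed to be connected.

```python
import random


def karger_min_cut(edges, trials=200, seed=0):
    """Estimate the minimum cut of a connected undirected graph.

    edges: list of (u, v) pairs. Returns the smallest cut size seen over `trials` runs.
    """
    rng = random.Random(seed)
    nodes = {v for edge in edges for v in edge}
    best = len(edges)
    for _ in range(trials):
        parent = {v: v for v in nodes}  # union-find forest of contracted groups

        def find(v):
            while parent[v] != v:
                parent[v] = parent[parent[v]]  # path halving
                v = parent[v]
            return v

        groups = len(nodes)
        order = edges[:]
        rng.shuffle(order)  # random edge order plays the role of random contractions
        for a, b in order:
            if groups == 2:
                break
            ra, rb = find(a), find(b)
            if ra != rb:  # skip edges that became self-loops
                parent[ra] = rb
                groups -= 1
        # Edges whose endpoints ended up in different groups form the cut.
        cut = sum(1 for a, b in edges if find(a) != find(b))
        best = min(best, cut)
    return best


# Two triangles joined by a single bridge edge (2, 3).
bridge_graph = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(karger_min_cut(bridge_graph))
```

A single run finds a minimum cut only with some probability, which is why the sketch repeats the contraction many times and keeps the smallest cut it sees.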

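My State Before Resigning

Author: Benson
2024-04-06 00:00
Work
Today is April 6, 2024. It should be the last day of the Qingming holiday, but our team is responsible for an important operations campaign and I am on duty at the office today, so I am writing this post at the company.
I plan to hand in my resignation tomorrow, so my attitude toward work has been more open recently. I am more tolerant of the boss and the work I used to dislike, and I can see more of the good side. Looking back now, pressure was the root cause of my terrible state at work; under excessive pressure, my own ideas and the purpose of the work became distorted.
Growth, starting a business, and work all need a loop: think from a broad perspective, keep breaking things down, validate and implement, then collect the results and continue the loop. Under excessive pressure at the company, though, many things became distorted, such as collecting results and reporting upward. Planning also became daunting, for fear that a plan that could not be delivered would bring a poor performance rating. The original freedom and creativity of an internet company was lost.
Of course, one fundamental reason for the loss of creativity is a lack of ability. Endless work filled my days and left no spare time to learn and think, so creativity was out of the question; which seems to put the blame back on endless work.

Reading
With more time, I have started reading books, and I finished UNIX: A History and a Memoir (《UNIX传奇》) in about a week. The lives of the greats are always so plain and yet so dazzling, even if they are not representative. The book covers UNIX, the C language, the shell, pipes, awk, and other commonly used tools. Without this book, I would have thought of these tools as tricky little gadgets; I would never have imagined that people set out to develop them, or that they have authors. This is the first time I have understood the history of Linux so completely.
The book also has some interesting stories. For example, the first machine UNIX was developed on was not purchased for developing UNIX, but for developing text-editing software, so that Bell Labs staff could write patent applications more efficiently, since they filed a new patent almost every day. It seems that many great works were not what their creators originally set out to make; they were by-products of the time.
The satisfaction that reading brings feels both unfamiliar and familiar.

The future
Before resigning, I would imagine that if I quit I would learn XX, XX, and so on. I tried this recently. Learning a new field is always hard: you start from zero and take it step by step, so the satisfaction is naturally less strong and less direct. But I should try to enjoy this kind of thing.
I am still glad to have the chance to take a gap of a few months and to restart a stage of my life. Under normal circumstances, I will probably never have a chance like this again.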

2024-01-01

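Author: Benson
2024-01-01 00:00
Review
Looking back at the goals I set for myself in 2022:
1) Learn English well: the door to a new world has actually opened already, and in 2023 I hope to roam the English-language world without barriers.
Outcome: I read The Economist from time to time and use Reddit, but I am still far from using English fluently. In the second half of the year I started preparing for IELTS, and that is when I truly invested effort in studying. Half-hearted study is too stressful; I need a concrete goal. I scored 5.5 on IELTS in December, and next year I must reach 6.5.
2) Handle my relationship well: my partner is about to graduate and there will be many issues to face afterwards; I hope to handle them well.
I think it went quite well, and our goals for the future are now roughly aligned.
3) Stay healthy: keep up swimming and cycling, and consider badminton.
After moving house in June, I was far from my cycling friends and hardly cycled this year. Swimming lapsed even more, for various reasons. A pity. If it were not for the study-abroad plan, I should have been able to keep up one sport.
4) Do one small thing really well: for example, a Kaggle competition.
I do not seem to have learned much this year; I just daydreamed. I did figure out the study-abroad plan, and next year I must make it happen.
5) Professional learning: AGI, reinforcement learning, underlying risk-control technology.
None.
6) Reading: read more books, watch fewer videos.
None. Very hard.
7) Film and TV: Netflix subscription acquired, come on.
None.
8) Go to bed early: sigh.
After moving in June, the commute forced me to go to bed early.
9) Other
Blog: I wrote 34 posts in 2022 and only 11 in 2023, which is poor.
I built my first product: Top Hacker News. But I found that I do not read it very often myself, so I will reduce the amount of content; maybe that will raise how often I read it.

Summary
I set too many goals in 2022 and most of them were not achieved; next year I should set fewer.
The goals I did complete were mostly driven by external forces, which shows that my execution is still poor.
I feel I have thought the study-abroad plan through; next year I must make it happen.
Looking back, what were the memorable moments this year? Moving house with my partner, and IELTS and studying abroad. Just these two. Too few memorable moments; a wasted year.

2024, a new year
IELTS 6.5 and studying abroad
Fitness
Learning: pick up some new technology, though I have not decided which
Read books and watch films to add some culture
When every year's summary turns into a running account, I wonder whether it is still worth recording. The goals for 2024 are not much different from those for 2023; it feels like OKRs at work, the same things every time. So next year I need to execute well, so that in 2025 I can set some new goals.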

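The First DuckDB PR I Understood

Author: Benson
2023-10-22 00:00
Background
DuckDB is the first open source project I have come across that I can use day to day and that is not that huge. I happened to want to try a new technology, so I dug into it.
My use for DuckDB is querying and analyzing data with SQL, using local files as the data source. I usually handle this kind of need with Python, but that is far less efficient than SQL, so DuckDB has improved my productivity a good deal.
The first time I looked at DuckDB issues I saw this one, a fairly simple bug that someone had reduced to a minimal reproduction: when map is used, the where condition does not take effect on the map column. For example:

    create table data as from (
        values
            ([1], [3]),
            ([2], [9]),
            ([3], [15]),
            ([4], [21]),
    ) as t(l, r);

    -- this works
    select l[1], r[1], map(l, r) from data;

    -- map output ignores where filter.
    select l[1], r[1], map(l, r) from data where r[1] != 15;

Output:

    ┌───────┬───────┬───────────────────────┐
    │ l[1]  │ r[1]  │       map(l, r)       │
    │ int32 │ int32 │ map(integer, integer) │
    ├───────┼───────┼───────────────────────┤
    │     1 │     3 │ {1=3}                 │
    │     2 │     9 │ {2=9}                 │
    │     3 │    15 │ {3=15}                │
    │     4 │    21 │ {4=21}                │
    └───────┴───────┴───────────────────────┘
    ┌───────┬───────┬───────────────────────┐
    │ l[1]  │ r[1]  │       map(l, r)       │
    │ int32 │ int32 │ map(integer, integer) │
    ├───────┼───────┼───────────────────────┤
    │     1 │     3 │ {1=3}                 │
    │     2 │     9 │ {2=9}                 │
    │     4 │    21 │ {3=15}                │
    └───────┴───────┴───────────────────────┘

At the time I just picked out a few approachable and interesting issues and then put the matter aside, planning to look at how to fix them a week later when I had time. But when I came back a week later, this issue had already been fixed by someone else. The open source community really is active.
I wanted to solve other issues, but got nowhere after looking for a long time. So I decided to see how this map issue had been fixed. I could not understand that either at first, so I spent about two weekends studying it (perhaps 1 day in total), and as of now I finally understand the core of it.

Process
Most of the time went into debugging: looking at what each variable actually contains to help me understand the code. Once that is clear, you know where the bug is and naturally how to fix it.
A nice thing about DuckDB is that it does not depend on any external modules. After spending a little time fixing one environment problem, it compiled on the first try.
From the repository root, run make debug to build. ./build/debug/duckdb is the newly built binary; running it starts the newly built DuckDB.
You can use printf in the code to print whatever debug logs you want. Debugging went fairly smoothly, and once I could print debug information, I was confident I could understand the code.
While debugging I was mainly looking for which variable holds the core data. There were many variables I did not know how to print. The code defines many basic classes of its own, so you have to read through the code, especially for Vector variables, which are pointers as well.
Vector: a custom vector type. You can print the value of element i with *(vector.GetData() + i), and print the type with vector.GetType().ToString().c_str().
A Vector also specifies different data types, and I do not yet know how to print according to a specific type. The default is unsigned int8, so each +1 on the pointer moves only 8 bits (one byte).

PR
Conclusion first: the core is really the last four lines of code.

    if (key_vector.GetVectorType() == VectorType::CONSTANT_VECTOR) {
        // omitted
        map_key_vector.Reference(expanded_const);
        value_vector.Flatten(count);
        map_value_vector.Reference(ListVector::GetEntry(value_vector));
    } else if (value_vector.GetVectorType() == VectorType::CONSTANT_VECTOR) {
        // omitted
        map_value_vector.Reference(expanded_const);
        key_vector.Flatten(count);
        map_key_vector.Reference(ListVector::GetEntry(key_vector));
    } else {
        // core change
        key_vector.Flatten(count);   // truncate key_vector according to the filter information
        value_vector.Flatten(count); // truncate value_vector according to the filter information
        map_value_vector.Reference(ListVector::GetEntry(value_vector)); // assign the intermediate result value_vector to the final result map_value_vector
        map_key_vector.Reference(ListVector::GetEntry(key_vector));     // assign the intermediate result key_vector to the final result map_key_vector
    }

For someone meeting this kind of project for the first time (me two weeks ago), it is hard to see what these four lines mean, and it took me a long time to locate them.

Conclusion
For the first time I fully understood a PR in a C++ project and learned the whole workflow: build, debug, test.
Apart from the core data handling, though, most of the code work lies in filling in the other kinds of variables, and doing that completely requires being very familiar with the project.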
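To make the effect of the Flatten calls easier to picture, here is a toy Python model of the idea as the post describes it. It is not DuckDB's real data structure: `ToyVector`, `selection`, and the two `build_map_*` functions are hypothetical names, and the model simply assumes that a filtered vector keeps all physical rows plus a list of selected row indices.

```python
class ToyVector:
    """Hypothetical stand-in for a column vector: rows plus an optional selection."""

    def __init__(self, rows, selection=None):
        self.rows = rows            # physical rows, unfiltered
        self.selection = selection  # logical row i lives at rows[selection[i]]

    def flatten(self, count):
        """Materialize the first `count` logical rows and drop the selection."""
        if self.selection is not None:
            self.rows = [self.rows[self.selection[i]] for i in range(count)]
            self.selection = None


def build_map_buggy(keys, values, count):
    # Reads physical rows directly, so a pending selection (the filter) is ignored.
    return [dict(zip(keys.rows[i], values.rows[i])) for i in range(count)]


def build_map_fixed(keys, values, count):
    # Flatten first so that physical row i equals logical row i.
    keys.flatten(count)
    values.flatten(count)
    return [dict(zip(keys.rows[i], values.rows[i])) for i in range(count)]


l_rows = [[1], [2], [3], [4]]
r_rows = [[3], [9], [15], [21]]
selection = [0, 1, 3]  # rows kept by: where r[1] != 15

buggy = build_map_buggy(ToyVector(l_rows, selection), ToyVector(r_rows, selection), 3)
fixed = build_map_fixed(ToyVector(l_rows, selection), ToyVector(r_rows, selection), 3)
print(buggy)  # [{1: 3}, {2: 9}, {3: 15}]
print(fixed)  # [{1: 3}, {2: 9}, {4: 21}]
```

In this toy model the buggy version reproduces the {3=15} row from the issue output, and flattening before building the map gives the expected {4=21}.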

Learning English in October

Author: Benson
2023-10-07 00:00
10.7
The holiday was so, so great. I almost forgot everything except family and friends. In the first three days, I met my family, relatives, and friends; the feeling of meeting old friends is great. In the two days after that, I met my girlfriend's family and talked about marriage; everyone was happy. In the last two days of the holiday, we returned to Beijing, cleaned up the apartment, and cooked and ate lunch. Today a new life begins and I have to go back to work. I hope everything goes well.

10.29
I know I haven't accepted English as my day-to-day language, because when I'm free I prefer reading or watching media in Chinese rather than English. Reading English text is a little hard for me; if I don't push myself to read it, I won't do it. The good thing is that I have already kept up memorizing English words for twenty-three days. That is big progress.

DuckDB has had a few new issues recently, but it is a complex project for a beginner, so I want to find some other projects to work on. I tried Kaggle competitions; two of them interest me, but I don't have time to explore them.

I found another interesting project: converting podcasts to text with speech-to-text tools. Sometimes we don't have a suitable environment for listening to audio, but most of the time we can read text, and reading is more efficient for some people. If we want the information in a podcast, reading the text is a great way to get it. I tried some open source tools: I can do this for free using GitHub Actions and the open source tool whisper.cpp, without paying anything. It's great; I will treat this as my highest-quality project.

In summary, I have tried several projects. Maybe I should list them on a dedicated page, showing their status and priority. I am writing English notes less and less. Do more and keep going!