How to back up (export) all your Jianshu articles and images

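1. Scenario:

I have written many articles on Jianshu. To keep them safe, I want to download a backup of them at regular intervals. How can this be done?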

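2. Overall approach

2.1. Breaking down the problem

Download all articles: use the export feature that Jianshu provides officially
Download all images: write a Python script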

3. Steps

3.1. Environment

My Jianshu articles are written in Markdown
Python is installed on the computer

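3.2. Step 1: Download all articles

Log in to Jianshu -> click your avatar -> Settings -> Account Management -> click Download All Articles

Follow the steps shown in the screenshot below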

image.png

What the download looks like

image.png

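3.3. Step 2: Write a Python script

The work breaks down into these operations:

Walk through the folders and files
Open each file and read it line by line
Find the Markdown image markers, extract each image URL, and download the image.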

Walking a directory tree in Python

  for root, dirs, files in os.walk(dir_name):
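
The loop header above can be fleshed out into a small helper. This is a minimal sketch, not the author's script: the function name `find_markdown_files` is hypothetical, and it assumes the exported articles carry a `.md` extension.

```python
import os


def find_markdown_files(dir_name):
    """Return the paths of all .md files under dir_name, sorted for stable output."""
    found = []
    # os.walk yields (current folder, its subfolders, its files) for every folder
    for root, dirs, files in os.walk(dir_name):
        for name in files:
            # Assumption: exported articles end in ".md"
            if name.lower().endswith(".md"):
                found.append(os.path.join(root, name))
    return sorted(found)
```

Calling `find_markdown_files("path/to/export")` on the unpacked export folder would give the list of article files to scan for image links.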


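Reading a file line by line in Python

# Assumes the exported Markdown files are UTF-8 encoded
with open(a_markdown_file, encoding="utf-8") as f:
    # Iterating over the file reads one line at a time; i counts from 1
    for i, line in enumerate(f, 1):
        ln = line.rstrip("\n")  # drop only the trailing newline
        # print("[{}] [{}]".format(i, ln))
        process_line(ln, output_dir)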

An image reference in Markdown looks like the following, so we need a regular expression to extract the URL.

![image.png](https://upload-images.jianshu.io/upload_images/2044033-48c2eae384fc250c.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

Use this regular expression:

    img_list = re.findall(r"\!\[[^\]]*\]\((.+?)\)", line, re.S)
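
To show how the regular expression connects to the download step, here is a minimal sketch under stated assumptions. It is not the author's full script: the helper names `extract_image_urls`, `local_name_for`, and `download_image` are hypothetical, the local file name is assumed to be the last path segment of the URL with the `?imageMogr2/...` query string dropped, and whether the image host accepts plain requests without extra headers is not verified here.

```python
import os
import re
import urllib.request
from urllib.parse import urlsplit

# Same idea as the pattern above: capture the URL inside ![...](...)
IMG_PATTERN = re.compile(r"!\[[^\]]*\]\((.+?)\)")


def extract_image_urls(line):
    """Return every image URL found in one line of Markdown."""
    return IMG_PATTERN.findall(line)


def local_name_for(url):
    """Derive a file name from the URL path, ignoring the query string."""
    return os.path.basename(urlsplit(url).path)


def download_image(url, output_dir):
    """Download url into output_dir, skipping files that already exist."""
    os.makedirs(output_dir, exist_ok=True)
    target = os.path.join(output_dir, local_name_for(url))
    if not os.path.exists(target):
        urllib.request.urlretrieve(url, target)
    return target


def process_line(line, output_dir):
    """Download every image referenced on this line."""
    for url in extract_image_urls(line):
        download_image(url, output_dir)
```

Skipping files that already exist keeps repeated backups cheap, since only images added since the last run are fetched.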

4. The complete Python script:

The script is hosted on GitHub. Click: full code