My thesis writing timeline: analysed using Dropbox and Python
Published: 2019-05-11

I wrote my PhD thesis in LaTeX, and stored all of the files in my Dropbox folder. Dropbox stores previous versions of your files (for up to 30 days if you are on their free plan). Towards the end of my PhD, I realised that I could write a fairly simple Python script that would grab all of these previous versions, which I could then use to do some interesting analyses. So, over a year after my thesis was submitted, I’ve finally got around to looking at the data.

I should point out here that this data comes from a sample size of one – and so if you’re writing a PhD thesis then don’t compare your speed/volume/length/whatever to me! So, with that disclaimer, on to how I did it, and what I found…

Getting the data

I wrote a nice simple class in Python to grab all previous versions of a file from Dropbox. It’s available on GitHub, and can be used entirely independently from the LaTeX analysis that I did. It is really easy to use: just grab the file, install the Dropbox library (pip install dropbox) and run something like this:

from DropboxDownloader import DropboxDownloader

# Initialise the object and give it the folder to store its downloads in
d = DropboxDownloader('/Users/robin/ThesisFilesDropboxLog')

# Download all available previous versions
d.download_history_for_files("/Users/robin/Dropbox/_PhD/_FinalThesis",  # Folder containing files to download
                             "*.tex",  # 'glob' string specifying files to download
                             "/Users/robin/Dropbox/")  # Path to your Dropbox folder

The code inside the DropboxDownloader class is actually quite simple – it basically just calls the revisions method of the DropboxClient object, does a bit of processing of filenames and timestamps, and then grabs the file contents with the get_file method, making sure to set the rev parameter appropriately.
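The DropboxClient methods named above belong to the older version of the Dropbox API. As a rough sketch only, and not the code in the DropboxDownloader class, the same idea might look like this with the newer `dropbox` Python SDK, assuming its `files_list_revisions` and `files_download` calls. The function name `download_revisions`, the output filename pattern, and the `dbx` client argument are all hypothetical choices made for illustration.

```python
from pathlib import Path


def download_revisions(dbx, dropbox_path, out_dir, limit=100):
    """Save every available revision of one Dropbox file into out_dir.

    `dbx` is assumed to be a `dropbox.Dropbox` client (v2 SDK), and
    `dropbox_path` a path relative to the Dropbox root.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for entry in dbx.files_list_revisions(dropbox_path, limit=limit).entries:
        # Fetch the content of this specific revision by setting `rev`.
        _metadata, response = dbx.files_download(dropbox_path, rev=entry.rev)
        # Prefix the filename with the revision's timestamp (hypothetical naming scheme).
        stamp = entry.server_modified.strftime("%Y%m%d_%H%M%S")
        target = out_dir / f"{stamp}__{Path(dropbox_path).name}"
        target.write_bytes(response.content)
        saved.append(target)
    return saved
```

Passing the client in as an argument keeps the function independent of how authentication is set up, which varies between accounts and SDK versions.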

Counting the words

Now that we have a folder (or set of folders) full of files, we need to actually count the words in them. How you do this will vary significantly depending on what typesetting system you’re using, but for LaTeX we can use the wonderful texcount. You’ll probably find it is installed automatically with your TeX distribution, and it has a very comprehensive set of documentation that I’ll let you go away and read…

For our purposes, we wanted a simple output of the total number of words in the file, so I ran it as:

texcount -brief -total -1 -sum file.tex

I ran this from Python for each file (using something far better than os.system!), combining the results into a Pandas DataFrame.
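A minimal sketch of that step, assuming the standard-library subprocess module (the text does not say exactly what was used) and assuming that texcount prints a single number when run with the flags above. The helper names `parse_total`, `count_words` and `count_folder` are made up for this example.

```python
import subprocess
from pathlib import Path


def parse_total(output):
    """Extract the word total from texcount's one-line output."""
    # Take the first token that is a plain integer.
    for token in output.replace(":", " ").split():
        if token.isdigit():
            return int(token)
    raise ValueError(f"no word count found in {output!r}")


def count_words(tex_path):
    """Run texcount on one file and return its total word count."""
    result = subprocess.run(
        ["texcount", "-brief", "-total", "-1", "-sum", str(tex_path)],
        capture_output=True, text=True, check=True,
    )
    return parse_total(result.stdout)


def count_folder(folder):
    """Return one {'file', 'words'} row for every .tex file in folder."""
    return [{"file": p.name, "words": count_words(p)}
            for p in sorted(Path(folder).glob("*.tex"))]
```

A list of rows like this can be passed straight to `pandas.DataFrame(...)` to get one row per file.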

Doing the analysis

Now we get to the interesting bit: what can we find out about how I wrote my thesis? I’m not going to go into details about exactly how I did all of this, but I will occasionally link to useful Pandas or NumPy functions that I used.

When you get hold of some data – particularly if it is time-series – then it is always good to plot it and see what it looks like. The pandas plot function makes this very easy – and we can easily get a plot like this:

[Figure: total word count over time]

This shows the total word count of my thesis over time. I didn’t have the idea of writing this code until well into my PhD, so the time series starts in June 2014 when I was busy working on the practical side of my PhD. By that point I had already written some chapters (such as the literature review), but I didn’t really write anything else until early August (exactly the 1st August, as it happens). I then wrote quite steadily until my word count peaked on the 18th September, around the time that I submitted my final draft to my supervisors. The decrease after that was me removing a number of ‘less useful’ bits on advice from them!

Overall, I wrote 22,317 words between those two dates (a period of 48 days), which equates to an average of 464 words a day. However, on 22 of those days I wrote nothing, so on the 26 days that I actually wrote, I wrote an average of 858 words. My maximum number of words written in one day was 2,516, and the minimum was -7,139 (when I removed a lot!). The smallest non-zero amount was 5 words… that must have been a day when I was lacking in inspiration!
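Statistics like these can be derived from the word-count snapshots with a few Pandas calls. The sketch below uses a small made-up series rather than the thesis data, and the variable names are illustrative; it assumes the snapshots are held in a Series indexed by timestamp.

```python
import pandas as pd

# Hypothetical snapshots: total word count at the time each version was saved.
counts = pd.Series(
    [1000, 1200, 1500, 1500, 1450, 2050],
    index=pd.to_datetime([
        "2014-08-01 09:00", "2014-08-01 17:00", "2014-08-02 11:00",
        "2014-08-04 10:00", "2014-08-04 16:00", "2014-08-05 12:00",
    ]),
)

# Last snapshot of each calendar day; days with no saves carry the total forward.
daily_total = counts.resample("D").last().ffill()

# Words written per day = change in the end-of-day total (the first day has no previous total).
daily_written = daily_total.diff().dropna()

mean_all_days = daily_written.mean()
mean_writing_days = daily_written[daily_written != 0].mean()
most_written = daily_written.max()
most_removed = daily_written.min()
```

Negative values in `daily_written` correspond to days where more was deleted than added.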

Some interesting graphs

One thing that I thought would be interesting would be to look at the total number of words I wrote each day of the week:

[Figure: total words written per day of the week]

This shows a very noticeable tailing off as the week goes on, and then a peak again on Saturday. However, as this is a sum over the whole period it may hide a lot of interesting patterns. To see these, we can plot a heatmap showing the total number of words written each day of each week:
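Both the day-of-week totals and the week-by-day grid behind such a heatmap can be built from a per-day series. This is a sketch on made-up numbers, not the notebook's code; the column names and the way weeks are numbered (counting from the first day in the data) are assumptions.

```python
import pandas as pd

# Hypothetical words-written-per-day series: two weeks, starting on a Monday.
daily_written = pd.Series(
    [500, 300, 0, 200, 0, 800, 0, 400, 0, 100, 0, 0, 900, 50],
    index=pd.date_range("2014-08-04", periods=14, freq="D"),
)

frame = daily_written.to_frame("words")
frame["weekday"] = frame.index.dayofweek  # 0 = Monday ... 6 = Sunday
# Week number counted from the first day in the series.
frame["week"] = (frame.index - frame.index[0]).days // 7 + 1

# Total words per day of the week, summed over the whole period.
per_weekday = frame.groupby("weekday")["words"].sum()

# Rows = week, columns = weekday: the grid that a heatmap would display.
grid = frame.pivot_table(index="week", columns="weekday", values="words", aggfunc="sum")
```

The `grid` table can then be handed to whichever plotting library is preferred to draw the heatmap.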

[Figure: heatmap of words written per day, by week and day of the week]

It seems like weeks 6 and 7 were very productive, and things tailed off gradually over the whole period, until the last week when they suddenly increased again (note that some of the very high values were when I copied things I’d written elsewhere into my main thesis documents).

Looking at the number of words written in each hourly period is very easy in Pandas: group the data by hour, apply the ohlc function (Open-High-Low-Close), and then subtract the Open value (the number of words at the start of the hour) from the Close value (the number of words at the end of the hour). Again, we can look at the total number of words written in each hour, summed across the whole period:
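The hourly calculation can be sketched as follows, using `resample(...).ohlc()` as one way of grouping by the hour. The data is invented for illustration, and dropping the empty hours is a choice made here rather than something the text specifies.

```python
import pandas as pd

# Hypothetical snapshots of the total word count.
counts = pd.Series(
    [1000, 1100, 1150, 1150, 1400, 1380, 1500],
    index=pd.to_datetime([
        "2014-08-01 08:05", "2014-08-01 08:40", "2014-08-01 08:55",
        "2014-08-01 13:10", "2014-08-01 13:50",
        "2014-08-02 08:15", "2014-08-02 08:45",
    ]),
)

# Open/high/low/close of the word count within each clock hour;
# hours with no snapshots come out as NaN and are dropped.
hourly = counts.resample("h").ohlc().dropna()

# Words written within each hour = close - open.
written = hourly["close"] - hourly["open"]

# Total and mean words per hour of the day, across the whole period.
total_by_hour = written.groupby(written.index.hour).sum()
mean_by_hour = written.groupby(written.index.hour).mean()
```

One limitation of this approach is that any change between the last snapshot of one hour and the first snapshot of a later hour is not attributed to either hour.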

[Figure: total words written in each hour of the day]

This shows that I had a big peak just after lunchtime (I tend to take a fairly early lunch around 12:00 or 12:30), with some peaks before breakfast (around 8:00) and after breakfast (10:00) – and similarly around the time of my evening meal (18:00), and then increasing as a bit of late work before bed. Of course, this shows the total contribution of each of these hours across the whole writing period, and doesn’t take into account how often I actually did any writing during these periods.

To see that, we need to look at the mean number of words written during each hourly period:

[Figure: mean words written in each hour of the day]

This still shows a bit of a peak around lunchtime, but shows that by far my most productive time was early in the morning. Basically, when I wrote early in the morning I got a lot written, but I didn’t write early in the morning very often!

As before, we can look at this in more detail in a heatmap, in this instance by both hour of the day and day of the week:

[Figure: heatmap of words written, by hour of the day and day of the week]

You can really start to see my schedule here. For example, I rarely wrote much on Sunday mornings because I was at church, but wrote quite effectively once I got back from work. I wrote very little around my evening meal time, and wrote very little on Monday mornings or Friday afternoons – which makes sense!

So, I hope you enjoyed this little tour through my thesis writing. All of the code for grabbing the versions from Dropbox is available on GitHub, along with a (very badly-written and badly-documented) notebook.

Reposted from: http://mwhwd.baihongyu.com/
