
Original link: https://news.ycombinator.com/item?id=41478690

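One commenter proposes restructuring an online platform for academic papers. In the current design, users have to click through to a "trending" page to get the list of papers; the suggestion is to show that list directly on the homepage, as Hacker News does. Ranking should be driven by user votes rather than by comment activity. Usernames currently allow spaces, which could complicate implementing "@" functionality in comments. The commenter also suggests moving from PDF to HTML, arguing that HTML offers more flexibility for reformatting, navigation, and accessibility. Others are wary of abandoning PDF, since it is the standard format for sharing academic work: academics value stable locations within a document, such as "the bottom of page 7" or "three lines after Theorem 1.2", as reference points in discussion and analysis. Even so, arXiv is rolling out a tool that converts LaTeX documents to HTML, offering a middle ground for readers who want either format. Another issue raised concerns how papers are organized and categorized: some suggest the platform should not focus only on individual papers but should foster a more collaborative working environment that encourages exchanging and developing ideas at every stage. Finally, criticism of the existing academic system highlights the difficulty of judging research quality purely through metrics such as citation counts, h-index, and awards; a growing view is that qualitative assessment by subject-matter experts is essential for accurate and complete evaluation.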

Great idea.

- The frontpage should directly show the list of papers, like with HN. You shouldn't have to click on "trending" first. (When you are logged in, you see a list of featured papers on the homepage, which isn't as engaging as the "trending" page. Again, compare HN: Same homepage whether you're logged in or not.)

- Ranking shouldn't be based on comment activity, which favors controversial papers; rather, papers should be voted on, like comments.

- It's slightly confusing that usernames allow spaces. It will also make it harder to implement some kind of @ functionality in the comments.

- Use HTML rather than PDF. Something that could be trivial with HTML, like clicking on an image to show a bigger version, requires you to awkwardly zoom in with PDF. With HTML, you would also have one column, which would fit better with the split paper/comments view.
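
To illustrate the point about usernames with spaces, here is a minimal Python sketch. The mention pattern and both function names are hypothetical (nothing here is taken from the site's actual code); it only shows why a simple "@" parser stops at the first space, and one possible workaround that matches against a set of known usernames instead.

```python
import re

# Hypothetical mention pattern: "@" followed by a run of word characters.
MENTION = re.compile(r"@(\w+)")


def naive_mentions(text: str) -> list[str]:
    """Return the names a simple @-parser would extract (stops at whitespace)."""
    return MENTION.findall(text)


def mentions_from_known(text: str, usernames: set[str]) -> list[str]:
    """Match against known usernames, longest first, so names with spaces resolve."""
    found = []
    for name in sorted(usernames, key=len, reverse=True):
        token = "@" + name
        if token in text:
            found.append(name)
            # Remove the matched mention so a shorter prefix name is not also reported.
            text = text.replace(token, "")
    return found


comment = "thanks @Jane Doe for the proof"
print(naive_mentions(comment))                             # only captures "Jane"
print(mentions_from_known(comment, {"Jane", "Jane Doe"}))  # resolves "Jane Doe"
```

The second approach needs a lookup against the user table for every comment and still has to decide what to do when one username is a prefix of another, which is the kind of extra work that disallowing spaces would avoid.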

> Use HTML rather than PDF.

The PDF is the original paper, as it appears on arXiv, so using PDF is natural.

In general academics prefer PDF to HTML. In part, this is just because our tooling produces PDFs, so this is easiest. But also, we tend to prefer that the formatting be semi-canonical, so that "the bottom of page 7" or "three lines after Theorem 1.2" are meaningful things to say and ask questions about.

That said, the arXiv is rolling out an experimental LaTeX-to-HTML converter for those who prefer HTML, for those who usually prefer PDF but may be just browsing on their phone at the time, or for those who have accessibility issues with PDFs. I just checked this out for one of my own papers; it is not perfect, but it is pretty good, especially given that I did absolutely nothing to ensure that our work would look good in this format:

https://arxiv.org/html/2404.00541v1

So it looks like we're converging towards having the best of both worlds.

> In general academics prefer PDF to HTML. In part, this is just because our tooling produces PDFs, so this is easiest.

The tooling producing PDF by default absolutely makes the preference for PDF justifiable. However, tooling is driven by usage - if more papers come with rendered HTML (e.g. through Pandoc if necessary), and people start preferring to consume HTML, then tooling support for HTML will improve.

> But also, we tend to prefer that the formatting be semi-canonical, so that "the bottom of page 7" or "three lines after Theorem 1.2" are meaningful things to say and ask questions about.

Couldn't you replace references like "the bottom of page 7" with others like "two sentences after theorem 1.2" that are layout-independent? This would also make it easier to rewrite parts of the paper without having to go back and fix all of your layout-dependent references when the layout shifts.

HTML has strong advantages for both paper and electronic reading, so I think it's worth making an effort to adopt.

When I print out a paper to take notes, the margins are usually too narrow for my note-taking, and I additionally have a preference for a narrow margin on one side and a wide margin on the other (on the same side, not alternating with page parity like a book), which virtually no paper has in its PDF representation. When I read a paper electronically, I want to eliminate pagination and read the entire thing as a single long page. Both of these things are significantly easier to do with HTML than with LaTeX (and, in the "eliminate pagination" case, I've never found a way to do it with LaTeX at all).

(also, in general, HTML is just far more flexible and accessible than PDF for most people to modify to suit their preferences - I think most on HN would agree with that)

HTML still lacks one key feature: a way of storing the entire document as a single file that remains fully functional offline and can be reasonably expected to be widely supported for decades. Research papers are used both for communicating new results and for archiving them. The long-term stability needed for the latter has never been a strong point of web technology.

Indeed, I posted my first paper in 2006. It is still live on the internet in exactly the same format, and I've done absolutely nothing to maintain it.

I'm guessing there are few web pages of any significance which need to stay exactly the same for a long time. Here is one example which I've seen trotted out from time to time on HN:

https://www.dolekemp96.org/main.htm

This is clearly the exception. It seems that maintainers of web pages usually expect that they'll need to maintain and update them for as long as they want them to be accessible, and that's definitely not something I'd care to do for research papers.

You can make an HTML file self-contained by embedding CSS in a `
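
As a rough illustration of the self-contained idea, here is a Python sketch of one common approach (not necessarily what the comment above had in mind): local stylesheets are inlined into `<style>` elements and local images become base64 `data:` URIs. The function name `inline_assets` is made up, and the regexes assume simple, double-quoted markup; a real tool would use a proper HTML parser.

```python
import base64
import mimetypes
import re
from pathlib import Path


def inline_assets(html: str, base_dir: Path) -> str:
    """Inline local stylesheets and images so the page works as a single file."""

    def inline_css(match: re.Match) -> str:
        # Replace the <link> tag with the stylesheet's contents in a <style> element.
        css = (base_dir / match.group(1)).read_text(encoding="utf-8")
        return f"<style>\n{css}\n</style>"

    def inline_img(match: re.Match) -> str:
        # Replace the image path with a base64 data: URI.
        path = base_dir / match.group(2)
        mime = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
        data = base64.b64encode(path.read_bytes()).decode("ascii")
        return f"{match.group(1)}data:{mime};base64,{data}{match.group(3)}"

    # Assumes stylesheet links are written exactly as rel="stylesheet" href="...".
    html = re.sub(r'<link rel="stylesheet" href="([^"]+)"\s*/?>', inline_css, html)
    # Only src values without a colon are touched, so http(s): and data: URLs are skipped.
    html = re.sub(r'(<img[^>]*\ssrc=")([^":]+)(")', inline_img, html)
    return html
```

Calling `inline_assets(page_html, Path("paper_dir"))` would return one string that can be saved as a single `.html` file. This does not address the separate concern about whether browsers decades from now will render it the same way.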
