(comments)

  > intractable Bayesian integral problems

With ML, most of what we are doing is modeling intractable distributions...

  > the multivariate calculus involved in NNs are primarily differentiation

Sure, but I'm not sure what your critique is here. This confirms my point. Maybe I should have been clearer by adding that most people do not take calculus in high school. While it is offered there, these are the advanced courses, and I'd be wary of being so pejorative. I know a large number of great mathematicians, computer scientists, and physicists who did not take calculus in high school. I don't think we need to discourage anyone or needlessly make them feel dumb. I'd rather encourage more people to undertake further math education, and I believe the lessons learned from calculus are highly beneficial in real-world, everyday usage without requiring explicit formula writing (as referenced in my prior post).

As a side note, I've found this to be an important point and one of the most difficult lessons to learn in becoming an effective math teacher: once you understand something, it often seems obvious, and it is easy to forget how much you struggled to get to that point. If you can remember the struggle, you will be a better teacher. I also encourage teaching because revisiting material can reveal the holes in your knowledge and, often, your overconfidence (though the problem repeats as you teach a course for a long time). Clearly this is something that Feynman recognized, and it led to his famous studying technique.

  > Doesn't matter, if it creates value

Value is too abstract and I think you should clarify. If you need a mine, digging it with a spoon creates value. But I don't understand your argument here and it appears to me that you also don't agree since you later discuss traditional (presumably GLMs?) statistics models vs ML. This argument seems to suggest that both create value but one creates _more_ value. And in this sense, yes I agree that it is important to consider what has more value. After all, isn't all of this under the broad scope of optimization? ;)

  > Pray tell me how discrete math and abstract algebra has anything to do with day to day ML research.

Since we both answered the first part, I'll address the second. First, I'm not sure I claimed abstract algebra was necessary; that was a comment about whether you were going to argue with me about "math being a language". So, miscommunication. Second, there's quite a lot of research on equivalent networks, gradient analysis, interpretability, and so on that does require knowledge of fields, groups, rings, sets, and I'll even include measure theory. As you noted in answering the first part, there's also a fair amount of statistics.

  > Most statistics practitioners outside of graduate research will just be plugging formulas

And? I may be misinterpreting, but this argument suggests to me that you believe this effort was fruitless. But I think you discount that the knowledge gained from it is what enables one to know which tools to use. Again, I'm referencing the prior point about not needing to explicitly write equations. The knowledge gained is still valuable, and I believe that mathematics is the best way we have to teach these lessons in a generalizable manner. Personally, I'd argue that it is common to use the wrong tools due to a lack of nuanced understanding and one's natural tendency to get lazy (we all do it, including me). So even if a novice could use a flow chart for analysis, I hope we both realize how often errors will appear, and how these types of errors will __devalue__ the task.

I think there is also an issue with how one analyzes value and reward. We're in a complicated enough society -- certainly a field -- that it is frequent for costs to be outsourced to others and to time. It is frequent to gain reward immediately or in the short term but have overall negative rewards in the medium to long term. It is unfortunate that these feedback signals degrade (noise) with time, but that is the reality of the world. I can even give day to day examples if you want (as well as calc), but this is long enough.

  > Are you sure? The traditional linear algebra (and similar) models never (or rarely) outperformed neural networks

I don't know how to address this because I'm not sure where I made this claim. Though I will say that there are plenty of problems where traditional methods do win out, where xgboost is better, and that computational costs are a factor in real-world settings. But it is all about context. There's no strictly dominating method. I just don't think I understand your argument, because it feels like a non sequitur.

  > A flapping bird wing...  [vs] static airfoils.

I think this example better clarifies your lack of understanding of aerospace engineering than it does your argument. I'm guessing you're drawing this conclusion from observation rather than from principles. There is a lot of research that goes into ornithopters, and this is not due to aesthetics. But again, context matters; there is no strictly dominating method.

I think miscommunication is happening on this point due to a difference in usage of "elegance." If we reference MW, I believe you are using it with definition 1c while I'm using it with 1d. As in, it isn't just aesthetics. There's good reason nature went down this path instead of another. It's the same reason the context matters: all optimization problems are solved under constraints. Solution spaces are also quite large, and as we've referenced before, in these large intractable spaces there's usually no global optimum. This is often true even in highly constrained problems.

  > more knowledge is always a good thing

Glad we agree. I hope we all try to continually learn and challenge our own beliefs. I do want to ensure we recognize the parts of our positions that we agree upon and not strictly focus on the differentiation.

  > ML research can only be advanced by people with "mathematical maturity"

No such claim was ever made and I will never make such a claim. Nor will I make such a claim about any field. If you think it was, I'd suggest taking a second to cool off and rereading what I wrote with this context in mind. Perhaps we'll be in much more agreement then (specifically, what I tell my students and the meaning of the referenced "all models are wrong but some models are useful"). Misinterpretation has occurred. The fault can be mine, but I'm lacking the words to adequately clarify, so I hope this can do so. I'm sorry to outsource the work to you, but I did try to revise and found it lacking. I think this will likely be more efficient. I do think this is miscommunication on both sides and I hope we can both try to minimize it.
