Time Travel: 2014

Chapter 339 Alternative academic style

If Eve Carly's guess was right, what Lin Hui had created was not only astonishingly efficient.

Its potential for extension would be outrageous as well.

After all, anything built on the idea of migration can itself be "migrated" in a sense. In other words, it is portable.

This is so fucking outrageous.

Research on text summarization, and on natural language processing as a whole, used to be more or less insular.

But if a method is portable, it could well spread into other fields.

Thinking of this, Eve Carly suddenly felt that Lin Hui's focus must not be on the small fish pond of natural language processing.

Lin Hui is playing a big game of chess.

Although she had known Lin Hui for only a short time, she had had frequent academic exchanges with him.

And Eve Carly was certain that Lin Hui's academic ambitions were huge.

Previously, Eve Carly felt that Lin Hui could open a new door in the direction of natural language processing.

Now it seemed that Lin Hui's future influence would certainly not be limited to natural language processing.

He would achieve great things across the entire field of machine learning.

Perhaps even far beyond that, and Eve Carly was looking forward to all of it.

There is nothing more exciting than witnessing the rise of a genius.

(If there is, it may only be witnessing the destruction of a "god".)

Lin Hui did not hold any titles yet.

Even so, his achievements so far had been dazzling enough.

Eve Carly believes that Lin Hui will realize his ambition bit by bit.

Why can Eve Carly make such a judgment?

Lin Hui's brilliant academic achievements in the past were just one of the reasons Eve Carly came to this conclusion.

This is not the most important reason.

What really allowed Eve Carly to conclude that Lin Hui could realize his ambition was that Lin Hui had his own academic style.

Compared with visible academic achievements, academic style is an almost metaphysical thing that cannot be seen or touched.

It sounds illusory.

But academic style does exist.

Discussions about the term "academic style" often appear in various academic exchanges and daily discussions among scientific researchers.

Whether it is academic routes or academic habits, these things will affect the formation of academic style in some sense.

Whether a researcher is above or below the bar academically generally depends on whether he or she has an independent academic style.

Researchers who are only scratching the surface academically generally do not have a style of their own.

Their research output is more haphazard, and their topics are mainly "follow-up research".

Researchers above this level generally have a stable academic style.

A stable academic style does not mean everything.

But at the very least it means the researcher has a relatively clear plan for his or her academic route.

Perhaps Lin Hui himself did not notice his academic style.

But Eve Carly feels that Lin Hui has his own academic style.

And the style is very obvious.

The fact that Lin Hui has an academic style can also reflect the stability of his academic line.

Therefore, Eve Carly believed that Lin Hui could realize his ambition step by step.

And what kind of academic style does Lin Hui have?

For the time being, Eve Carly could not describe it precisely in specific terms.

But in terms of academic habits, Eve Carly felt that Lin Hui had a very distinctive characteristic.

That is, Lin Hui was always committed to winning at the starting line.

Of course, winning at the starting line is just a metaphor. The exact way to put it is this:

When solving academic problems and practical engineering problems, Lin Hui was strongly inclined to nip possible problems in the bud.

Eve Carly naturally had a basis for making this judgment.

Take the pre-training that Lin Hui had mentioned not long ago in the supplementary content of his paper.

In the past, the word "training" usually brought to mind machine learning experts tuning the model that training produced.

Few people gave the training process itself as much careful thought as Lin Hui did.

After all, training on a corpus is already a very early step in the normal process of building a language model.

This example already illustrates Eve Carly’s judgment.

In addition to this example, there is also the first conversation with Lin Hui after coming to China.

At that time, the two had discussed how to deal with the "dimension explosion" that can result from vectorizing a corpus.

The dimensionality reduction approaches Eve Carly had originally envisioned included converting high-dimensional models into low-dimensional ones, reducing the high-dimensional data obtained after analysis to low-dimensional data, and so on.

Lin Hui's idea was to apply dimensionality reduction directly to the original high-dimensional vectors obtained by vectorizing the corpus.

Before that, when dimension explosion came up, few researchers thought of working directly on the original high-dimensional data.

After all, abstracting corpus information into raw vector data comes almost at the very front of the corresponding research.
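To make the idea concrete, here is a minimal Python sketch of "vectorize first, then reduce the raw vectors directly". It is only an illustration under stated assumptions, not Lin Hui's actual method, which the story does not spell out: the bag-of-words vectorizer, the random projection used as the reduction step, and the names `vectorize` and `random_projection` are all hypothetical stand-ins.

```python
import random
from collections import Counter


def vectorize(corpus):
    """Bag-of-words: one dimension per distinct word in the corpus."""
    vocab = sorted({w for doc in corpus for w in doc.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for doc in corpus:
        vec = [0.0] * len(vocab)
        for word, count in Counter(doc.lower().split()).items():
            vec[index[word]] = float(count)
        vectors.append(vec)
    return vocab, vectors


def random_projection(vectors, target_dim, seed=0):
    """Reduce each raw vector to target_dim values using random directions.

    Random projection is an assumed stand-in for whatever reduction
    method would really be used at this stage.
    """
    rng = random.Random(seed)
    source_dim = len(vectors[0])
    scale = 1.0 / target_dim ** 0.5
    matrix = [
        [rng.gauss(0.0, 1.0) * scale for _ in range(source_dim)]
        for _ in range(target_dim)
    ]
    return [
        [sum(m * x for m, x in zip(row, vec)) for row in matrix]
        for vec in vectors
    ]


corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "stock prices fell sharply on monday",
]
vocab, high_dim = vectorize(corpus)                    # one dimension per word
low_dim = random_projection(high_dim, target_dim=4)   # reduced before any modeling
```

The point of the sketch is the ordering: the reduction is applied to the raw vectors, before any model or later analysis ever sees them.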

Eve Carly felt that these could support her previous judgment.

From this judgment, a further inference could be drawn.

If a research project involves several stages, each with room to maneuver,

then Lin Hui would almost certainly put his main effort into the initial stage, or open up a new track before the initial stage even began.

What's the use of knowing this?

Of course it is useful, even very useful.

Previously, Eve Carly had been quite unclear as to why Lin Hui, after proposing a generative text summarization algorithm, still wanted to acquire the patent she had created, namely "A New Method for Text Judgment, Screening and Comparison".

Current automatic summarization methods fall mainly into two kinds: extractive and generative.

There are many differences in principle and practical performance between these two summarization methods.

But both are essentially automatic text summarization.

For all automatic text summarization, its technical framework can be summarized as:

Content representation → weight calculation → content selection → content organization.

Content representation is the process of dividing the original text into text units. It is mainly preprocessing work, such as splitting the text into words and sentences.

The main purpose of content representation is to turn raw text into a form that is easy for algorithms to analyze.

Weight calculation assigns a weight score to each text unit (that is, to the preprocessed text). The weights can be computed in various ways, for example from feature scores over extracted content features, from sequence labeling, or from classification models.

The purpose of this step is to complete a preliminary analysis of the preprocessed original text through this series of calculations.

Content selection picks a subset of the weighted text units (that is, the text scored in the weight calculation step) to form a summary candidate set. Units can be selected according to the required summary length, or by linear programming, submodular functions, heuristic algorithms, and so on.

Content organization refers to organizing the content of the candidate set to form a final summary, which can be output in order according to the word count requirements. Some researchers have also proposed using methods based on semantic information, templates and neural network learning to generate summaries that meet the requirements.
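The four stages above can be sketched in a few lines of Python. This is a deliberately simple extractive toy under stated assumptions, not the generative algorithm Lin Hui proposed and not the content of Eve Carly's patent: word-frequency averages stand in for weight calculation, a top-k cut stands in for content selection, and every function name is hypothetical.

```python
import re
from collections import Counter


def represent(text):
    """Content representation: split raw text into sentence units and word tokens."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    return [(s, re.findall(r"[a-z']+", s.lower())) for s in sentences]


def weigh(units):
    """Weight calculation: average document-wide frequency of each sentence's words."""
    freq = Counter(w for _, words in units for w in words)
    return [
        sum(freq[w] for w in words) / len(words) if words else 0.0
        for _, words in units
    ]


def select(weights, max_sentences):
    """Content selection: keep the indices of the highest-weighted sentences."""
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return ranked[:max_sentences]


def organize(units, chosen):
    """Content organization: output the chosen sentences in their original order."""
    return " ".join(units[i][0] for i in sorted(chosen))


def summarize(text, max_sentences=2):
    units = represent(text)
    return organize(units, select(weigh(units), max_sentences))


text = "Cats like fish. Cats like fish and cats like milk. Dogs bark."
print(summarize(text, max_sentences=1))  # prints "Cats like fish."
```

Each function maps onto one arrow of the framework, so swapping in a better scorer or selector changes only one stage and leaves the others untouched.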

From these descriptions of each layer of the framework, it is clear that weight calculation, content selection, and content organization are all very important.

If weight calculation and content selection are not worked out, the algorithm will not know which parts of the text to summarize.

If content organization is handled poorly, the user experience suffers directly.

For these reasons, researchers in this time and space did pay more attention to weight calculation, content selection, and content organization when studying automatic text summarization.
