Time Travel: 2014
Chapter 155 The pursuing pursuer (Part 2)
Moreover, the gap in speed with the LIN HUI algorithm was at least something Harley Price could comprehend.
The gap in accuracy, however, was truly cause for despair.
The accuracy of the X1 algorithm developed by Harley Price and his colleagues could not even match that of the algorithm behind the Yahoo News summary created by that idiot Lian Nick.
This made Harley Price very depressed.
…
After a while, Harley Price suddenly had an idea and shouted to Eclair Kilcaja:
"Dear buddy, do you think the problem lies in the accuracy measurement standard set by LIN HUI?
Under that measure, only LIN HUI's own algorithm would ever get high scores..."
Eclair Kilcaja: ...
Eclair Kilcaja: "Maybe you have some basis for that idea, but I suggest you go to sleep now... You may be a little confused. What makes you think that a standard reviewed by a standards committee would be an unfair one?"
Harley Price: "Because that LIN HUI is from China, and they can do anything. I remember that some mobile phone manufacturers in their country develop their own benchmark software just so they can claim their phones are powerful. Only phones made by those manufacturers get high scores on that software.
In my opinion, LIN HUI's measurement model is similar to that kind of benchmark software..."
Harley Price continued: "In short, I think the LH text summarization accuracy measurement model puts us at a serious disadvantage.
Maybe we can create our own measurement standard based on the ideas of LIN HUI..."
Eclair Kilcaja: "I have thought about the problem you mentioned.
However, it is not easy to build a model by following LIN HUI's process for constructing the standard.
If we build a similar standard according to LIN HUI's idea,
first we need to use a language model to evaluate the fluency of the algorithm-generated language, and then...
If we follow the same steps to build the model,
we will very likely get stuck right at the construction of the language model.
After all, our corpus is really inferior...
A report from MIT NLP, which we previously collaborated with,
also proves that it is not feasible to build a language model according to LIN HUI's idea."
Harley Price: "Just because the guys at MIT think it's not feasible doesn't necessarily mean it's not feasible.
They are most likely just shirking their responsibilities.
Anyway, I think we can try to draw on the ideas of LIN HUI to create a new measurement standard."
Eclair Kilcaja: "Are you sure we can come up with a new model along the lines of LIN HUI?
How can you guarantee that the model we create will not be exactly the same as the one he created?"
Harley Price: "We need to go down this road anyway.
If we can't even reproduce his model for measuring accuracy,
how will we know whether there is anything fishy about it?"
Harley Price continued: "In the past, our corpus may have been of poor quality.
But there is nothing wrong with the corpus we use now.
The Natural Language Center at the University of California, Berkeley, is now working with us.
When we tested the X1 verification algorithm, we used a corpus of 100,000 text-summary sequences as the training set..."
Eclair Kilcaja retorted: "No, no, no, that's not enough!
To reach the level of text processing of the LIN HUI algorithm, we need a corpus of at least millions of text-summary sequences as a training set.
And this is just the tip of the iceberg.
We also need to construct a validation set on the order of 10^4 text-summary sequences with manual scoring labels,
and a test set on the order of 10^3 text-summary sequences whose scores have been cross-checked by multiple human raters for consistency.
Otherwise, our measurement model may not reach the level of confidence achieved by LIN HUI."
Harley Price: "You do have a point!
The most practical way to reduce the margin of error is to increase the sample size.
A corpus of millions of text-summary sequences is manageable.
Compared with a corpus on the order of 100,000,
the construction difficulty only increases linearly.
But are you sure we want to build manually labeled validation and test sets as large as the ones you mentioned?
By a conservative estimate, it will take us nearly a month just to build the validation set of text-summary sequences with manual scoring labels.
And that is only possible if we and the linguistics majors work together without any friction.
Building a test set on the order of 10^3 text-summary sequences with consistent manual cross-scoring is even more difficult.
Previously we have only built one on the order of 10^2.
Every time the test set grows by one order of magnitude, the construction difficulty increases exponentially.
It took us nearly two months to build a test set of 150 texts with consistent cross-scoring for testing the extractive summarization algorithm.
And why do we need to introduce human factors at all?
Wouldn't that be equivalent to going back down the old path of subjective accuracy criteria?"
Eclair Kilcaja: "That's exactly what I meant.
From the start, I thought it was impossible to come up with a new measurement standard based on LIN HUI's idea.
Even if we can follow the technical route of LIN HUI,
we will still face an overwhelming workload."
Hearing Eclair Kilcaja's words,
Harley Price despaired: "So just the initial work of establishing an accuracy measure is going to cost us a lot of time?
But the senior executives responsible for decision-making will never sit back and watch us spend too much time on this algorithm.
They are likely to go straight to LIN HUI and seek a license for his algorithm.
For those business elites, technology is just an accessory to the capital game.
Once they get LIN HUI's new technology, we will probably be in a miserable position...
What on earth should we do?"
Eclair Kilcaja: "Who knows? Maybe we should pack up and get ready to go to India."
Harley Price: "Going to India wouldn't be so bad. I heard that a Google Africa Research Center is being built.
If we're unlucky, we may have to go to Africa."
Eclair Kilcaja: ...
Of course, these words were just a joke.
After all, he was a researcher at a top research institution.
Eclair Kilcaja would not lose his fighting spirit so easily.
After a while, Eclair Kilcaja said: "It's not as if there is nothing we can do.
I don't think we should follow the technical route of LIN HUI.
This LIN HUI is so cunning!
The information he made public may well have been released to mislead us.
What we need to do now is sort out some of the conclusions we have drawn on our own."