Universities across the city are spending millions on digital makeovers, aiming to become hotbeds of high-tech innovation and creativity. Cornell University, New York University, Columbia University and others are all building new academic programs aimed at the city’s burgeoning tech scene.
Over the last five years, the amount of digital information worldwide has increased almost 2,000 percent, exceeding 2.8 trillion gigabytes, perhaps as many bits of information as there are stars in the universe. Few modern trends have greater capacity to transform our economy and society than “big data.” Hundreds of Columbia faculty members and students, industry leaders and technological visionaries gathered to discuss the topic at a daylong symposium on April 5 in Low Library, sponsored by the University’s new Institute for Data Sciences and Engineering.
Data scientist has been called the sexiest job of the 21st century, and by most accounts the hot new field of data science promises to revolutionize industries from business to government, health care to academia.
Rachel Schutt, a senior research scientist at Johnson Research Labs, taught “Introduction to Data Science” last semester at Columbia. She described the data scientist this way: “a hybrid computer scientist software engineer statistician.” She added: “The best tend to be really curious people, thinkers who ask good questions and are O.K. dealing with unstructured situations and trying to find structure in them.”
Language is pervasive. It constitutes one of the most complex forms of human behavior and offers a rich problem domain for computational and data-driven approaches. Natural language processing (NLP) deals with the interactions between computers and human languages, often using machine learning to approach problems in text or speech. The vast amount of linguistic data now in electronic form poses significant challenges (and opportunities) for managing and accessing that data with NLP technologies.
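
As a rough illustration of what using machine learning on text can look like, the minimal Python sketch below trains a tiny naive Bayes classifier on word counts. The choice of technique, the function names (`tokenize`, `train`, `predict`) and the four training sentences are all assumptions made for this example; none of them come from the programs or courses described here, and real NLP systems rely on far larger corpora and more careful tokenization.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train(examples):
    """Count words per label from an iterable of (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    label_counts = Counter()            # label -> number of training texts
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability under naive Bayes."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Log prior: fraction of training texts carrying this label.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            # Add-one (Laplace) smoothing so unseen words do not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, invented purely for illustration.
TRAINING = [
    ("what a wonderful and inspiring lecture", "positive"),
    ("the symposium was great and well organized", "positive"),
    ("a dull and confusing talk", "negative"),
    ("the session was boring and badly organized", "negative"),
]

model = train(TRAINING)
print(predict(model, "an inspiring and wonderful symposium"))
print(predict(model, "a boring and confusing session"))
```

With only four training sentences the classifier is a toy, but it shows the basic pattern: turn text into counts, estimate probabilities from labeled examples, and use them to label new text.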