

[Techie Tuesday] Meet IISc’s Prasanta Kumar Ghosh whose patented speech and voice technology products are helping cancer patients



Prasanta Kumar Ghosh, Associate Professor at the Indian Institute of Science (IISc), Bengaluru, has developed several patented voice technologies using Artificial Intelligence (AI), Machine Learning (ML), and Augmented Reality (AR). His love for science and technology developed early on, when he was in school.

“As my father had struggled significantly in his day to get the right education, he made sure that I was well educated and had the right tutoring and mentorship. But what really excited me in my high school days in 1996 was electronics and the work that ISRO was doing then,” Prasanta tells YourStory.

His father was a government employee and his mother was a homemaker. For Prasanta, the aim was to get a job immediately after graduation. Despite his love for the work ISRO was doing at that time, he couldn’t take up the organisation’s offer as he was already working elsewhere.

“After graduating in electrical engineering from Jadavpur University in 2003, it was exceedingly important for us that I find a job. I had started applying everywhere I could, which gave me job offers from different places, and I started working at Usha Comp Private Limited in Kolkata,” he narrates.

Prasanta Kumar Ghosh presenting a poster titled “Automatic classification of question turns in spontaneous speech using lexical and prosodic evidence” at ICASSP 2008.

The world of research

However, he was never interested in simply holding a job; he wanted to do research in electrical engineering and newer technologies. “I explained to my father that quitting the job may seem like a tough call, but in the long run, it would pay more dividends,” explains Prasanta.

Thereafter, he attempted the IISc entrance exam to pursue post-graduation studies.

“My rank was 489, thus I missed out on a lot of IITs and even IISc. My friends joined IISc and they kept telling me that there was a vacancy for a research position at the institute. I cleared the exam in 2004, and then was selected for the programme,” he says.

It was during this time that he got an offer from ISRO. While pursuing his MSc at IISc and simultaneously working there, he realised he could build and work on solving significantly larger problems.

“The faculty members at IISc were nothing short of inspirational. Their style of teaching and the way they inspired people to pursue research made me fall in love with the field, and I decided to take on the role of a researcher,” adds Prasanta.

This meant putting in a lot of hard work academically. Despite having a job offer from a startup, Prasanta decided to continue on the academic route. He went on to become a research intern at Microsoft Research India where he focused on the area of audio-visual speaker verification in 2006.

Prasanta Kumar Ghosh at his commencement at the University of Southern California in 2011.

Speech compression

“I focused my research and work on speech compression. When you speak on a phone today and record the conversation, the voice is transmitted to your friend after compressing the audio. My work is around non-uniform sampling-based compression. Any waveform can be sampled across three key locations. You don’t have to look at the whole signal or all the samples, but the key locations are compressed and reconstructed,” explains Prasanta.
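The idea of non-uniform sampling can be illustrated with a minimal sketch. It assumes, purely for illustration, that the "key locations" are the local peaks and valleys of the waveform and that reconstruction is plain linear interpolation; the article does not specify which locations or which reconstruction Prasanta's work uses, so the function names and the test signal below are hypothetical.

```python
import math

def key_sample_indices(signal):
    """Return indices of the two endpoints and every local extremum."""
    keep = [0]
    for i in range(1, len(signal) - 1):
        slope_before = signal[i] - signal[i - 1]
        slope_after = signal[i + 1] - signal[i]
        if slope_before * slope_after < 0:  # slope changes sign: peak or valley
            keep.append(i)
    keep.append(len(signal) - 1)
    return keep

def reconstruct(indices, values, length):
    """Rebuild a full-length signal by linear interpolation between kept samples."""
    out = [0.0] * length
    points = list(zip(indices, values))
    for (i0, v0), (i1, v1) in zip(points, points[1:]):
        for i in range(i0, i1 + 1):
            t = (i - i0) / (i1 - i0)
            out[i] = v0 + t * (v1 - v0)
    return out

# Hypothetical test signal: a 5 Hz sine sampled uniformly at 1 kHz for one second.
signal = [math.sin(2 * math.pi * 5 * n / 1000) for n in range(1000)]
indices = key_sample_indices(signal)          # non-uniformly spaced key locations
values = [signal[i] for i in indices]         # only these samples are stored
rebuilt = reconstruct(indices, values, len(signal))
worst_error = max(abs(a - b) for a, b in zip(signal, rebuilt))
print(len(indices), "of", len(signal), "samples kept; worst error", round(worst_error, 3))
```

The reconstruction here is deliberately crude and lossy. A practical codec would choose its key locations and interpolation far more carefully, but the trade-off is the same: fewer stored samples in exchange for some reconstruction error.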

He went on to publish his research which fetched him a thesis award. It also made him realise that he could do more work in the space. “It got me to look at the other options in the field of speech and I started looking at places in the US where I could do my PhD,” he adds.

He went on to receive his PhD in electrical engineering from the University of Southern California (USC), Los Angeles in 2011. It was there that he learned how interdisciplinary work is done.

Multidisciplinary approach

“I worked at the intersection of science and technology. I worked with linguists, engineers, mathematicians, and others to build speech-recognition technology. I understood how an FM (frequency modulation) transmitter generates signals and this was the base for understanding how human speech worked,” explains Prasanta.

He also had experience working on a special electromagnetic programme at USC that would record and track the motion of the lips, tongue, and jaw while speaking. This further led to building the different voice recognition modules.

“I had this idea when I was in LA, which has a large Hispanic population that preferred speaking Spanish to English. I had a project where the speech of the doctor, which was in English, could be translated to Spanish so that the patient could understand them,” he explains.

During 2011-2012, Prasanta was with IBM India Research Lab (IRL) as a researcher. He was also awarded the INSPIRE Faculty Fellowship from the Department of Science and Technology (DST), Government of India in 2012.

“At IBM I worked more around intent classification in speech. For example, if someone asks ‘should I carry an umbrella tomorrow’, they actually want to know the weather for tomorrow,” says Prasanta. He also worked on text analytics and its intent.

He rejoined IISc after his stint at IBM. After having worked on speech recognition, the next step was working on audio-visual speech recognition.

“We speak with gestures, and it is important to understand how gestures can create realistic animation. We have an OptiTrack motion camera device that can record someone’s gestures when they speak, which can help in understanding speech behaviour,” he explains.

Prasanta receiving an honourable mention at the MHI Research Festival for the paper titled “Processing speech signal using auditory-like filterbank provides least uncertainty about articulatory gestures”.

Working on healthcare

Prasanta has also worked with hospitals like NIMHANS, St Johns, Bengaluru, etc. “Using the sound of your voice, we can, for example, try and understand how much the lung is congested. With HCG Hospital, we are trying to understand if you have a problem with your voice box. Many cancer patients have lost their voice box; we are trying to convert their speech into natural speech. Other than that, we are working to detect and improve the condition of patients with neurological problems who have a problem in speaking,” says Prasanta.

Now, he is working on speech recognition and voice technology using AI, ML, and AR. This work offers the promise of improving livelihoods, especially in rural parts of India.

However, while India is home to 22 official scheduled languages, and a total of 6,661 mother tongues, leading internet companies in India are currently focusing only on five or six Indian languages.

Although the market is still nascent, the lack of investment in local languages and dialects is one of the fundamental bottlenecks for the growth of voice technology in the country. Prasanta’s project aims at addressing this bottleneck by reaching out to the wider Indian language base and laying the foundation to make it beneficial for the masses.

Advising young techies, Prasanta says, “Find out what you are really passionate about and focus on that. Once you decide to go all out to build and work on your project, find the support of the right people. Anything you do today needs multiple people coming together, and then everything will fall in place.”



Why You Should Start a Business Only While You Have a Job



Opinions expressed by Entrepreneur contributors are their own.

Many people that I meet tell me that they dream of starting their own business. I always ask them, “Then why don’t you?” They typically respond by saying that they have so many financial and personal responsibilities that they can’t just quit their job to start a company. Then I tell them my story …



How brands can develop a Web3 entry strategy



At an increasing pace, brands are looking for an on-ramp to Web3 to connect with their customers. Whether it be a presence in a virtual world (fast-food chain Wendy’s opened a restaurant in Meta’s Horizon Worlds) or with digital goods (Coca-Cola launched virtual fashion items in Decentraland), companies are experimenting with attracting customers using these new environments.

Often, they’re doing so with a sense of FOMO — fear of missing out — as they race to capture the hearts and minds of Generation Z and Millennial consumers on these emerging platforms. 

The gap between Web3 interest and current experiences presents an opportunity

Our recent survey of over 700 online consumers reveals that they’re indeed interested in using Web3 to interact with companies: 51% said they would be interested in using these technologies to engage with brands. In the same breath, however, consumers say that brands aren’t doing a good job offering Web3 experiences that fully engage them, with 48% agreeing that companies are largely unsuccessful with their current initiatives. 

This finding reveals an opportunity for brands: They can experiment with compelling ways to meet the consumer appetite for Web3 and welcome new customers to their businesses through these new channels. 



Developing a Web3 entry strategy today

Although many opportunities exist for brands using Web3, many companies have difficulty defining what kind of experience they want to develop at this stage of the technology. The complexity and cost involved in developing extensive experiences within these environments, including the risk that consumer preferences might suddenly change, have limited many companies’ efforts to experiment.

Non-fungible tokens (NFTs), in particular, can serve brands as an on-ramp to Web3 because they have immediate practical applicability for business. They also hold future utility for other Web3 applications in decentralized autonomous organizations (DAOs) and the metaverse. 

The shift from collectible to utility NFTs

In 2021, much of the excitement around NFTs revolved around collecting rare, one-of-a-kind NFTs to post as a profile picture or hold in a digital wallet.

On the brand side of the equation, this manifested itself in companies launching collectible NFT projects that drove buzz around the initiative but largely resulted in little benefit to the collector. Since then, the conversation around NFTs from a brand perspective has shifted from them being used primarily as collectibles to utility NFTs that confer benefits to the holder. 

Our research among consumers reflects this shift.

To date, many companies have experimented with collectible NFTs to drive buzz as part of their Web3 initiatives. However, when it comes to including NFTs as part of the brand experience, customers indicate they would like to see a shift in this strategy. They indicate that utility NFTs (containing additional benefits) drive 5.1% higher purchase intent over the traditional collectible NFTs launched by many companies.

Top-performing utility types for NFTs

Customers are also seeking specific types of utility benefits from NFT-enabled brand programs, which indicates the value that companies can deliver to them.

Consumers say the top benefit they’re looking for in utility NFTs is a way to be rewarded for their brand loyalty, with 37.4% indicating that it increases their brand engagement. Other top benefits users look for in utility NFTs: a way to support organizations that drive social impact (27.8%), a branded community with exclusive offers (26.6%), and a way to obtain event tickets (23.9%).

NFT-enabled brand communities drive exclusive experiences

As the buzz around collectible NFTs fades, the next logical step for companies looking to attract Gen Z and Millennial customers is to build a brand community enabled by utility NFTs.

NFT brand communities can not only attract new customers with digital assets but also provide benefits that deliver added engagement and value. By wrapping these benefits in an active brand community powered by NFTs, companies can deliver continual engagement for customers and earn brand loyalty as a result. 

In these groups, brands can extend the conversation with their customers and deliver special perks, benefits and content to loyal members, such as access to special events, discounted offers and behind-the-scenes interviews. One benefit of these private membership communities is that brands can engage NFT holders in a curated, brand-safe environment.

Specifically for brand communities, our research of 700-plus consumers indicated that certain benefits would make them more likely to become NFT holders, with members-only discounts at the top (43.1%), followed by access to special product features (late check-out at a hotel, for example) (31.5%), and access to exclusive merchandise (30.7%). These types of added, exclusive amenities create additional value for Gen Z and Millennial customers. 

Taking advantage of Web3 today

Web3 technologies offer compelling opportunities for brands to immerse their customers in virtual experiences and to engage in decentralized ways. But because these technologies will take time to mature, many brands view them as available only in the future. In the meantime, they’re experimenting with one-off initiatives to attract Gen Z and Millennial customers on these platforms so as not to miss out on emerging opportunities. 

NFTs offer a practical on-ramp to Web3 for brands who want to experiment with opportunities that NFTs unlock now, as well as future-proof their strategies as virtual worlds mature and develop.

The best news is that companies can start with branded NFT communities today, with web technologies that users are widely adopting. Whether it’s using NFTs as a part of loyalty programs, as social impact initiatives, or for community building, these digital tokens offer a compelling way to attract Gen Z and Millennial customers and keep them loyal to your brand.

Dave Dickson is the founder of PicoNFT.



What ‘Everything as Code’ is and why it matters



Simply put, “Everything as Code” (EaC) is a way of managing IT infrastructure and building systems and tools that support modern software applications. It takes the manual processes and activities that people do and turns them into software code so that machines can do those things instead. Anything that teams need to figure out, agree on and control gets documented and “codified” as a configuration file that humans can read, and then machines can execute. 

Imagine if your kitchen could somehow understand your favorite recipe and then automatically choose the right tools for prepping it, the right process for cooking it, and even the right wine and dessert pairings, and then serve that exact meal to you over and over and over, every time you asked for it.  Sounds impossible? …It is. But if your kitchen were a public cloud provider, and your meal was a software application, it’s pretty much exactly what we’re talking about here.  

Everything as Code lets developers tell their cloud providers (or their local systems) exactly what they need in order to “serve up” the perfect application, and then the systems and tools and processes all execute that plan to make it so.

Using development best practices to accelerate time to market

EaC has been as much a cultural shift as a technological one because it completely revolutionized the way developers think about building, deploying and updating software. For example, before “as code,” if, say, a small business needed to run an application, they’d need to take a lot of steps. An IT administrator would order a physical server with the right amount of physical onboard disk, CPU and memory. It would arrive a few weeks later, and the admin would have to install the operating system, configure the kernel for maximum efficiency and then hook the server up to a physical network. All these steps were time-consuming, prone to human error and not easily scalable — and just a few of the things that had to be done before software developers could actually start running their apps.



With an “as code” approach, a developer can describe the same infrastructure in a configuration file, which tells their chosen cloud provider exactly what type of server environment to “spin up.” The cloud provider can have it set up in seconds, and development can start immediately. Later, if the developer needs to make a change or move from a test environment to a production environment, they can just modify the file, resubmit it and the cloud provider will have it updated in seconds. This dramatically increases speed and scale since machines can execute code far faster than humans can perform tasks, and if done right, it can also eliminate human error and repetitive work.

Popular “as code” examples

Two of the most popular examples of “as code” that are part of the Everything as Code movement are infrastructure as code and policy as code.

Infrastructure as code

Modern software runs in a hyper-virtualized environment, which adds complexity but also allows an unparalleled level of control. Application code is run in virtual containers, themselves running on virtual machines, all connected with virtual networking — all of which can be controlled with software code. Today, instead of ordering a server, developers can simply define what their app needs and then submit that request as software code. The cloud platforms execute that code and automatically build the environment that was requested. What is really important about this is that it allows companies to “scale on demand” — they pay for the actual usage at any given time, and they can scale up or down as needed. 
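To make the idea concrete, here is a minimal sketch of the declarative pattern behind infrastructure as code: the developer states the desired environment, and a tool works out which actions bring reality in line with it. The `ServerSpec` type, the `plan` function and the server names are hypothetical, invented for illustration; they do not correspond to any real cloud provider's API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ServerSpec:
    """Desired state for one server, as it might appear in a config file."""
    name: str
    cpus: int
    memory_gb: int

def plan(desired, actual):
    """Compare desired and actual state and list the actions needed to reconcile them."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in actual:
        if name not in desired:
            actions.append(("delete", name))
    return actions

# What the developer asks for (hypothetical values).
desired = {
    "web": ServerSpec("web", cpus=4, memory_gb=16),
    "db": ServerSpec("db", cpus=8, memory_gb=64),
}
# What is currently running (hypothetical values).
actual = {
    "web": ServerSpec("web", cpus=2, memory_gb=16),
    "cache": ServerSpec("cache", cpus=2, memory_gb=8),
}

actions = plan(desired, actual)
print(actions)  # [('update', 'web'), ('create', 'db'), ('delete', 'cache')]
```

Scaling up or down then amounts to editing the desired state and running the plan again, which is why the same file can be reviewed, versioned and reused like any other code.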

Policy as code

Policy as code means that policies are written as rules that are codified and enforced across different systems. Think of “policy as code” as a set of guardrails that determine what is allowed to happen and what can never happen. Policy is decoupled, or separated, from the app or infrastructure. That way, if a policy needs to be changed, a developer can change the code for the policy without changing, or risking breaking, the rest of the app or infrastructure. Open Policy Agent (OPA) is a great example of policy as code: OPA is a general-purpose policy engine that provides a single standard for policy that can be enforced anywhere.
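The decoupling described above can be sketched in a few lines. This is an illustrative toy, not OPA and not its Rego language: the `POLICY` structure, the `evaluate` function and the storage example are all assumptions made for the sake of the example.

```python
# The policy is plain data, kept apart from the application code.
POLICY = {
    "deny": [
        {"field": "public", "equals": True, "reason": "storage must not be public"},
        {"field": "encrypted", "equals": False, "reason": "storage must be encrypted"},
    ]
}

def evaluate(policy, resource):
    """Return the reasons a resource violates the policy (an empty list means allowed)."""
    return [
        rule["reason"]
        for rule in policy["deny"]
        if resource.get(rule["field"]) == rule["equals"]
    ]

def create_storage(resource, policy=POLICY):
    """Application code: it asks for a decision and never hard-codes the rules."""
    violations = evaluate(policy, resource)
    if violations:
        raise PermissionError("; ".join(violations))
    return f"created {resource['name']}"

print(create_storage({"name": "logs", "public": False, "encrypted": True}))
print(evaluate(POLICY, {"name": "scratch", "public": True, "encrypted": False}))
```

Adding or tightening a guardrail means editing `POLICY` only; `create_storage` stays untouched, which is the separation the section describes.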

Top three benefits of an everything-as-code approach

When you let humans be creative and think through hard problems, and you let them collaborate, share and imagine, we all know magic can happen. Everything as Code lets humans decide what’s right, and then tasks machines with making that so. That means you get the best out of everything, including: 

  • Repeatability: All processes, policies and descriptions are written down in code, so they are easily replicated. Let’s say a developer working for a global bank wants to set a policy that says, “Only users located in the central U.S. can access business accounts between 9 a.m. and 5 p.m. CT.” If another developer located in Europe wants to implement the same policy, but with an updated time zone, they can easily replicate the policy to do so. This saves the second developer time, frees them from reinventing the wheel, and also means less room for error.
  • Scalability: Defining configuration as code means that systems can scale up and down on demand with little risk of error. And since environments are literally defined in code and can be spun up anywhere, testing gets easier too. Development, testing and production environments can be as close to identical as possible, and lessons learned in one can be applied to the others with policy changes alone. With an “as code” approach, developers can test their changes before they are put into production, reducing the risk of errors and security issues. Automation also frees up developers’ time and allows them to focus on more differentiated work. 
  • Security: When security policy and configuration are moved out of dedicated black boxes, PDFs and team meetings and are instead codified in policy files, teams can treat those policy files just like any other software file. That means they check it in and peer-review it. They iterate on it and implement that security everywhere. It can be rolled forward or back as needed. And, when teams need to prove to auditors that their policy is in compliance, they can easily point to the code. 
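The repeatability example from the list above (business-account access limited to one region between 9 a.m. and 5 p.m. local time) can be sketched as follows. The `AccessPolicy` class and region names are hypothetical, and the time zones are fixed UTC offsets that ignore daylight saving time, a simplification chosen to keep the sketch self-contained.

```python
from dataclasses import dataclass, replace
from datetime import datetime, time, timedelta, timezone

@dataclass(frozen=True)
class AccessPolicy:
    """Who may access business accounts, and during which local hours."""
    region: str
    tz: timezone
    opens: time
    closes: time

    def allows(self, user_region, when_utc):
        """True if the user is in the permitted region during permitted local hours."""
        local = when_utc.astimezone(self.tz).time()
        return user_region == self.region and self.opens <= local < self.closes

# Original policy: central U.S., 9 a.m. to 5 p.m. (fixed UTC-6 offset, an assumption).
us_policy = AccessPolicy("central-us", timezone(timedelta(hours=-6)), time(9), time(17))

# A second developer copies the policy and changes only the region and time zone.
eu_policy = replace(us_policy, region="central-europe", tz=timezone(timedelta(hours=1)))

noon_utc = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)
print(us_policy.allows("central-us", noon_utc))      # 6 a.m. local time
print(eu_policy.allows("central-europe", noon_utc))  # 1 p.m. local time
```

Because the European policy is derived from the original rather than rewritten, the business hours and the checking logic stay identical by construction.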

When done right, “everything as code” lets teams define what’s right and then lets the systems take it from there. It democratizes the ability to build applications and solve problems, meaning more people can contribute to a better final product.

And, of course, Everything as Code isn’t just about the control of systems. It also takes advantage of the culture of work that software developers have built to minimize errors and maximize satisfaction and productivity. By automating away repetition and fostering collaboration, Everything as Code lets humans focus on new challenges and meaningful work and lets the machines handle the rest.

Tim Hinrichs is CTO and cofounder of Styra.
