Sunday, March 29, 2020

Should you invest in Natural Language Processing (NLP) technology?

What is the need for Natural Language Processing (NLP) technology in software products?

I think NLP will be needed as a basic technology in all software products, especially in end-user interfaces (e.g., websites, mobile apps, Amazon Echo apps). Implementing NLP would become a standard requirement, much like monitoring or multi-language support (I18N & L10N) in a standard product.

Why do you say that NLP would be an integral part of every IT product? Can you explain a few NLP use cases in IT products?

To see why NLP would be integral, let's first look at the NLP use cases that could be useful in your product and how they can add value to it.

  1. Chatbots - Every website would need this real-time customer interaction tool to filter many customer queries before routing them to the sales and support teams.
  2. Search Engine - Customers and internal teams should be able to issue human-like queries against the knowledge base.
  3. Spam Filtering - Standard filtering, organizing, and prioritizing of incoming job tasks or emails.
  4. Transcription of Audio/Video - Automatically transcribing human speech (speech to text). In the past, this was a major task in health care transcription for legal purposes, but today many tools also want to generate the text automatically.
    For example, an application like YouTube should be able to add subtitles to an uploaded video, or search someone's speech for certain words and point out where a term was mentioned.
  5. Content or News Curation - This is a very interesting use case for industry-specific businesses: identifying content and then processing it, in domains such as fake news, advertising, market intelligence, recruitment, and social media monitoring.
  6. Sentiment Analysis - Organizations can infer the emotion of a customer or end user from the terms the customer uses and take appropriate action based on that emotion.
  7. Intelligent Conversational Systems for Voice-Driven Applications (Voice Bots) - Human-to-machine interface over voice. Examples are Amazon Alexa, Google Assistant, and Apple Siri.
  8. Automatic Machine Translation - This can translate languages automatically, like Google Translate, and can also provide computer-assisted coding of standard business rules or even translation of code from one programming language to another.
  9. Cognitive Assistant - A personal assistant that stores your information, including your schedule, reminds you of your activities, and recommends improvements, such as when to stand up, sip some water, or cool down so that your blood pressure or heart rate does not rise, once it is integrated with your heart rate/BP monitoring application.
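
As a rough illustration of the sentiment analysis use case above, here is a minimal Python sketch that scores a customer message by the terms it contains and picks a follow-up action. The word lists, the function names `sentiment` and `route`, and the action labels are all hypothetical, made up for this example; a real product would more likely rely on a trained model or a curated domain lexicon.

```python
import re

# Hypothetical mini-lexicon, for illustration only.
POSITIVE = {"great", "love", "helpful", "thanks", "excellent"}
NEGATIVE = {"terrible", "hate", "broken", "refund", "useless"}


def sentiment(text: str) -> str:
    """Classify text as 'positive', 'negative', or 'neutral' by counting lexicon words."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def route(text: str) -> str:
    """Pick a (hypothetical) follow-up action based on the detected sentiment."""
    actions = {
        "negative": "escalate_to_support",
        "positive": "ask_for_review",
        "neutral": "no_action",
    }
    return actions[sentiment(text)]
```

Counting words like this ignores negation ("not great") and sarcasm, so it only shows the shape of the use case, not a production approach.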

Why should you invest in NLP?  

  • As a Product Owner - If you do not invest properly in NLP technology, your products will lag behind your competitors' solutions. If a novice developer has simply copied some public NLP code or library, the customer experience might be really pathetic, and the customer might think your NLP interface is idiotic. (I have personally felt this very often with many NLP applications.)
  • As a Developer - This is a great opportunity to land a new job, and it gives you a competitive advantage over developers who do not know NLP technology. If you know the fundamentals of this technology, you can tune your NLP application to the business use case and make a valuable contribution to your product, giving it a good customer experience and a competitive advantage.

Sunday, March 08, 2020

Comparing GoldenGate Kafka Handlers

GoldenGate differentiates itself in the market by offering 3 different types of adapters to Kafka. The three connectors to Kafka are:
1) Kafka Generic Handler (Pub/Sub)
2) Kafka Connect Handler
3) Kafka REST Proxy Handler

Each of these interfaces has its own advantages and disadvantages. The table below compares the three handlers:

| Functional area | Kafka Handler (Pub/Sub) | Kafka Connect Handler | Kafka REST Proxy Handler |
|---|---|---|---|
| Available in open-source Apache version | Yes | Yes | No* |
| Schema Registry integration | No | Yes* | Yes |
| Formatting to JSON | Yes, with GoldenGate Formatter | Yes, with Kafka Connect Framework | Yes |
| Formatting to Avro | Yes, with GoldenGate Formatter | Yes* | Yes* |
| Designed for high-volume throughput | Yes | Yes | No |
| Transactions and operations | Yes, both. Note: transactions have specific challenges, hence not recommended | Operations only | Operations only |
| Run-time mapping of key and topic | Yes | Yes | Yes |
| Connect protocol | Kafka Producer API | Kafka Producer API | HTTPS, REST |
| Web proxy support | No | No | Yes |
| Synchronous (also known as blocking mode) and asynchronous mode of operation | Yes, both. Note: synchronous has very low throughput | No, asynchronous only | No, asynchronous only |
| Kafka Interceptor support | Yes | Yes | No |

* Available with Confluent Community License and Enterprise License

For detailed information, refer to the GoldenGate for Big Data documentation.

Should developers write additional code which is not given in specifications?


The question is whether developers can write additional code that is not in the specification.

The answer to the question is self-explanatory in the image below:

A developer looking at the code would offer 2 solutions to fix the above problem:
i)  Instead of the assignment operator, the developer should have used a comparison operator, as in "isCrazyMurderingRobot == true"
ii) Use the final keyword so that the value is unalterable, as in "static final bool isCrazyMurderingRobot = false;"
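
To make solution (i) concrete, here is a small Python sketch of the same class of bug. It is only an analog: the code in the image is not reproduced in this text, so the function names and return strings are hypothetical, and Python's `:=` assignment expression stands in for an assignment written inside a condition.

```python
def interact_buggy(is_crazy_murdering_robot: bool = False) -> str:
    # Bug: ':=' assigns True to the flag, and the whole expression is True,
    # so the "kill" branch runs no matter what value was passed in.
    if (is_crazy_murdering_robot := True):
        return "kill"
    return "be nice"


def interact_fixed(is_crazy_murdering_robot: bool = False) -> str:
    # Fix (i): compare instead of assign. In idiomatic Python this would
    # simply be 'if is_crazy_murdering_robot:'.
    if is_crazy_murdering_robot == True:  # noqa: E712 (kept explicit for the comparison)
        return "kill"
    return "be nice"
```

With the flag left at its default of `False`, `interact_buggy()` still takes the "kill" branch, while `interact_fixed()` does not.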

But I think neither of these two solutions is the right one. The problem is that the whole routine was an unnecessary piece of code: it adds an option to kill(humans) that the functional specification never asked for. A programmer who tried to act smart but made a syntactical error created the whole mess.

When I was a software developer, I remember asking a product manager whether it was acceptable to write some additional functions (or methods) in the code for extra validation that was not in the specification. The answer I got was an absolute "NO", and he said it would be considered a "SIN" in a developer's job. I asked him why, and he explained: it might be easy for a developer to add a new feature to the product, but it is hugely difficult to drop a feature that is already in the code. So as a developer, it might take you a day to write a few hundred lines of code, but it takes years to maintain and remove that code.

Let's take the example of writing connector code: a simple program that connects to MongoDB and, per the proposed certification matrix, should connect to versions 3.5 and 3.6. As a developer, you might have been proactive and added a check of the MongoDB version in the code. With this additional check, if the customer chooses to upgrade MongoDB to 4.0, your code will stop working and will require a patch to run. Without the check, certifying the same old code with MongoDB 4.0 would have been a simple sanity test in your automation suite.
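
The difference can be sketched in a few lines of Python. This is an illustration under stated assumptions, not real connector code: no MongoDB driver is used, the server version is passed in as a plain string, and the names `CERTIFIED_VERSIONS`, `connect_with_extra_check`, and `connect_as_specified` are hypothetical.

```python
# Versions from the certification matrix mentioned in the text.
CERTIFIED_VERSIONS = ("3.5", "3.6")


def major_minor(server_version: str) -> str:
    """Reduce a version string such as '3.6.4' to '3.6'."""
    return ".".join(server_version.split(".")[:2])


def connect_with_extra_check(server_version: str) -> str:
    """Connector with the unspecified, 'proactive' version gate."""
    if major_minor(server_version) not in CERTIFIED_VERSIONS:
        # This branch is what forces a patch when the customer upgrades.
        raise RuntimeError(f"Unsupported MongoDB version: {server_version}")
    return f"connected to MongoDB {server_version}"


def connect_as_specified(server_version: str) -> str:
    """Connector that does only what the specification asks for."""
    return f"connected to MongoDB {server_version}"
```

In this sketch, calling `connect_with_extra_check("4.0.0")` raises an error until someone ships a patch, while `connect_as_specified("4.0.0")` keeps working and only needs to be re-certified by the test suite.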

If you have a strong urge to write that code, write it in a private branch or a commented-out section, as proactive code that may be required in the future.

In summary, it is a professional cardinal "SIN" to add additional code into a product mainline without a Product Manager's approval.