Wednesday, November 04, 2020

Interviews from the aspect of a tech guy


For the last 12 years I have had the opportunity to interview quite a few people for various positions, mostly related to Node.js, Java, Spring, Cloud and SQL processing. I would like to take this opportunity to write about what I was looking for in candidates' resumes, about relationships with recruiters, and about some common mistakes that people make during interviews or when writing resumes. Please note that this is strictly my opinion, so take it with a grain of salt.

Pre-interview, gathering requirements

On the several projects where I have worked or am still working, most job openings come from the business side, which finances the needs of a project; that is, there is a specific business requirement that needs to be fulfilled. These requirements arise from an ad-hoc need or opportunity, or from strategic planning.
In general I have observed two forms of financing: one where we are looking to initialize a platform or a novel way of processing data, and one where we are utilizing or expanding an existing platform for a business need or competitive advantage. The difference is that the first one develops supporting technology and the second one develops the business. This is where some ideas come to mind about what kind of experience and profiles we would require from the candidates. The other important factor is the company and team culture. Some teams are conservative and some are more open; some follow a strict structure and others are more free-form. These are all important factors for the success of the team. Remember, most of the time the goal is to establish an effective team, not a group of exceptional individuals.

Thoughts about candidates

One of the things with agencies is that most of the time they use automated search engines when looking for candidates, and some people tend to add things to the keywords or their job descriptions that they have perhaps only heard of or understand only barely. We get candidates of different backgrounds, both cultural and technical, and our goal should always be to see and understand how they can fit into our team and business culture, and whether they will bring something valuable in terms of their personal and professional skills and their knowledge of technology. One of the most important things for a candidate is to show enthusiasm and professionalism during this process, regardless of how they actually feel about the process or the interviewer. This proves a couple of points:
  1. That they are capable of desired behavior when needed (professional and courteous)
  2. That they understand the rules of engagement and what is expected of them
  3. That they are not shy to enter into the new situation and do their best to fit in
Anything else leaves the employer at the mercy of the moment and facing unnecessary training, when we should be focusing on bringing candidates up to speed on our business and not on a basic understanding of general work ethics and professional skills.

Unfortunately, candidates often come to the interview unprepared or uninterested. There is a difference between someone who is a bit scared or overwhelmed and someone who is unprepared. It is fully expected that a candidate will know their resume (that is, what is written in it) in detail. This is why writing resumes that are more than 2 pages long is very wrong in my opinion. I would usually not go past the second page, for several reasons:
  1. It is out of date (I am only interested in last 5 years at most)
  2. Technology used may be outdated
  3. Candidate may not have relevant knowledge to apply in the current scenario
  4. I have many resumes to review and only limited time to do this
  5. Many times candidates say that they do not remember implementation details from more than 5 years ago, and this does not help in a line of business where technology changes all the time
It is, of course, good when a candidate can reference their past experience, what they learned from it and how they evolved professionally, but the details of the discussion should stay with their most recent experience. There are many ifs and buts here; however, I need to make sure we establish a reasonable expectation of what is needed to pass the interview during the one-hour process. This is a very limited time to get to know someone you have met for the first time and to form both an objective and a subjective opinion of the candidate.

Reviewing resumes

It is well advised to contact one of the agencies that professionally craft resumes for a specific job. We all have various experiences and they do not always apply to every company or every job. The idea of a resume is to get you invited to the interview in the most focused and honest way; the point is not to write your life story in there. Professional agents, and the tools they use, have very limited time to go over resumes, especially for jobs where hundreds of applicants apply. Your resume must point out why you are the right person for the job requested, even though in many cases you will not be a 100% match. You need to single out the points from your career and experience that put you on top of the other candidates, or in other words, show how you are better than the second last candidate. Once you get an interview, the resume is not that important and serves only as a reference for the conversation. As far as cover letters go, I honestly do not remember when I last read one, so I would generally drop these.

When reviewing a resume, it always depends on whether I am reviewing for a full-time position or a contractor. The difference is that the former will have time to build a career and experience, and the latter needs to perform immediately. The criteria may be a bit different for the two types, but that depends entirely on the hiring manager and the company that is hiring.

Unless you graduated recently, I would generally put very little emphasis on education other than as a passing requirement for the job, and more on experience and enthusiasm to do the job. I would prefer to see more resumes where the candidate has experience in the open source community or has recently completed some relevant courses. In my opinion this gives such a candidate an edge over the rest. Continuous education and participation in your own or an open source project is extremely important in this line of work. If the candidate has a blog or examples of work, even better. Those are the points that put candidates well above the rest who do not have them, provided that what is displayed demonstrates capability and is in line with the resume and the job requirements.

The interview

On the interview day, whether it is online or in person, please do not be late. This is not a cliche; it shows some responsibility and respect towards the people involved. If you are going to be late for objective reasons, please pick up the phone and give the interviewer a call; it will be appreciated. It also shows that the candidate understands the importance of not being late to the many meetings that we may have later on.
Sometimes I interview alone and sometimes a hiring manager or another senior resource is present; it depends on the role we are interviewing for and on our availability. In any case there are going to be different sets of questions designed to:
  1. Validate candidate resume and expertise
  2. Establish effective communication
  3. Validate behavior under stress
  4. Establish problem solving capacity
When the interview starts, we first give a short introduction of our environment and the job requirements, and then we always give candidates a chance to introduce themselves. This can take anywhere from 5 to 10 minutes. The expectation is that the candidate already understands something about the company and the tools and processes that we are using, or how would they have gotten the interview in the first place, right? This also shows that the candidate is interested in this company and position and that they did some research prior to the interview.
All my interviews are technical, which means that we will talk about the technology and processes that the candidate has used and that are relevant to us. I definitely do not know, and cannot discuss, every item on everyone's resume, but based on my previous experience there will be plenty of them where we can find common ground for discussion.
Regardless of the candidate, I always go through basic questions to understand whether the candidate has thought about the technologies they are using and has not just copied and pasted from the Internet. This involves simple questions such as: why do you think there are both abstract classes and interfaces in Java, or what is the difference between Java and JEE? You would be amazed how many people struggle with these questions, even though they are the very foundation of how design patterns and Java coding work in general. After we cover the basics, just to get an idea of where the candidate stands relative to what is written in the resume, we move on to the relevant technologies that are listed as most recent or marked as expert or experienced. I establish a scenario and ask about it, with the expectation that the candidate will:
  1. Demonstrate expertise in the area
  2. Establish communication and problem solving capability
  3. Provide a suggestion and see how candidate can follow a given direction
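As a rough illustration of the basic abstract class versus interface question mentioned above, here is a minimal Java sketch. All of the names (Payable, Employee, Contractor, InterviewBasics) are made up for this example; it shows one common way to frame the distinction, not a model answer that a candidate is expected to recite.

```java
// Hypothetical names throughout; a sketch of the distinction only.
interface Payable {
    // An interface declares a capability that otherwise unrelated types can share.
    double monthlyPay();
}

abstract class Employee implements Payable {
    // An abstract class can hold state and shared behavior for related types,
    // while still leaving some methods (here monthlyPay) to its subclasses.
    private final String name;

    protected Employee(String name) {
        this.name = name;
    }

    String describe() {
        return name + " earns " + monthlyPay() + " per month";
    }
}

class Contractor extends Employee {
    private final double hourlyRate;
    private final int hours;

    Contractor(String name, double hourlyRate, int hours) {
        super(name);
        this.hourlyRate = hourlyRate;
        this.hours = hours;
    }

    @Override
    public double monthlyPay() {
        return hourlyRate * hours;
    }
}

class InterviewBasics {
    public static void main(String[] args) {
        Employee contractor = new Contractor("Ana", 50.0, 160);
        System.out.println(contractor.describe());
    }
}
```

A candidate who can explain why describe() lives in the abstract class while monthlyPay() is only declared in the interface has usually thought about the language rather than memorized it.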
Depending on the candidate and their previous responses, I may provide false leads or a conflicting narrative and ask the candidate to proceed that way, in order to establish the possibility of a future conflict or behavior problem. This is all done in a respectful way, of course, with the expectation that the candidate will politely, and with arguments, point out the error. In several cases this proved to be an impossible obstacle for a few candidates...
In general, and depending on the job, there are 3 major layers in an enterprise application: front end, middle tier and database. There are hundreds of technologies that can go in between, like caching, routing, messaging, ETL, etc., but the important factor is that every candidate needs to be aware of these differences and of how applications perform in general. When we discuss a problem, awareness of what is being applied as a solution needs to be present at all times.
One important suggestion for the interview is that, when giving examples, you should not start on a subject that you do not fully understand. This will lead to more in-depth questions and a possible dead end. If a question pops up that you are not familiar with, do try to supplement it with the closest example from your experience. This will put you on familiar ground where we can have a constructive conversation. A good exchange of ideas is what makes a good interview, in my opinion. And remember, an interview is not an interrogation; look at it more as a focused conversation.

Post-interview thoughts

Once the interview has been completed, we would normally gather to discuss the candidates and their performance. I would generally present a written or verbal report and provide my opinion, but as a general rule, the decision on hiring a candidate rests with the Director, the VP or whoever owns the budget for the project.

In any case, I have been in various interviews during my life; some were good and others not so much. Not everything always turns out as you expected, but in the end you only need one good interview, and with some luck, knowledge and a positive attitude you will get your next job, or at least a valuable experience, be it positive or negative.

As always, please share your thoughts and opinions about this matter. All the best!

Wednesday, October 28, 2020

Proxy in Microsoft proprietary world

In a world where we have a corporate proxy server requiring NTLM authentication, having a Mac or Linux machine may prove to be a difficult choice if all data needs to be routed through said proxy. Think of Brew, Git, Node, etc.: all of them will require authenticated access. When you try to use these apps, you may get a 407 error saying that proxy authentication is required. In this case we need to authenticate using NTLM, and this can easily be done with cntlm. We can leave cntlm running as a local proxy and connect to its default port 3128 on localhost from most applications that need it, be it through exported http and https proxy environment variables or through config in e.g. the gitconfig file. To start, we need to generate a hash for our password:

cntlm -u myusername -d mydomain -H

After this, you will get a password hash that you need to copy into the cntlm.conf file in /etc; it will be used to start your server and authenticate you. The server is started with the cntlm command, specifying the configuration file. A good resource to look at is the cntlm documentation, or I found that this post also helps.
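As a sketch of what the resulting configuration might look like, here is a minimal cntlm.conf fragment. The username and domain are the placeholders from the command above, and the upstream proxy host and port are hypothetical; replace them with your own values and paste the hash line exactly as cntlm -H printed it.

```
# /etc/cntlm.conf (sketch; all values are placeholders)
Username    myusername
Domain      mydomain

# Paste the hash printed by `cntlm -H` instead of storing a plain password
PassNTLMv2  <hash printed by cntlm -H>

# Upstream corporate proxy (hypothetical host and port)
Proxy       proxy.example.com:8080

# Addresses that should bypass the proxy
NoProxy     localhost, 127.0.0.*

# Local port that applications connect to
Listen      3128
```

With this in place, starting the server with cntlm -c /etc/cntlm.conf and pointing applications at http://localhost:3128 (for example through the http_proxy and https_proxy environment variables, or the proxy setting in gitconfig) should be enough for most tools.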


MSSQL to MongoDB

There has been a lot of discussion in recent years about NoSQL databases and when they would be preferable to SQL databases. There are a lot of articles written on this subject, but I wanted to give some insight into one of the past projects that I have been part of, and provide my perspective on the subject.

When choosing a topology for the system, we should take all of the factors into consideration and use the technology that is most suitable for the given task. IMHO, there is no technology that is optimal for all kinds of scenarios in a complex enterprise system. I have heard arguments that if a database already present in the ecosystem has even limited support for, e.g., NoSQL processing, then it should be considered a viable option simply because it is already there and would cut the cost of the initial deployment. I would rather disagree with this. I believe that we need to expand our thinking and use technologies that are built specifically for the task at hand, as the savings in speed and development cost further down the road should not be overlooked. The benefits of this approach are many and cannot be calculated from the initial deployment costs alone. There are also good articles with somewhat opposing views [link1] if you would like an honest debate on this matter.

In one of our use cases, MSSQL was already present and deployed in the cloud, and the initial decision was to use it to manipulate and store NoSQL data. Even though MSSQL has support for NoSQL structures, they are stored as strings that had to be continuously converted to table format (or handled with special functions) to get the full range of capabilities for, e.g., PSF (paging, sorting and filtering) and for any serious and frequent data updates. (Here is the guide from Microsoft regarding SQL/NoSQL on Azure on that matter.)

A better choice for JSON structures, IMHO, would be e.g. MongoDB or Cosmos DB, depending on what is available in your current infrastructure. MongoDB was our choice of database because the development team was more familiar with it and because we could deploy our instances to both public and private cloud relatively easily with the open source version of the database.

What we gained is that MongoDB is already optimized to deal with JSON structures, fully supports PSF at the driver level and is extremely easy to set up and maintain. We decided to start with SSL connections on 3 replication nodes. We also decided to save on the development environment and deploy all 3 nodes on the same server (for prod these should be distributed).

In our case MongoDB was used as a cache database in front of a secondary layer of APIs that were backed by an Oracle database in the business API layer. Since we were looking for more flexibility and increased performance, this was a good choice. Data arriving from the business layer was already well structured as one JSON document, but due to its size and the GUI editing capabilities we needed to break it down to offer more flexible usage based on the given business requirements for the sample GUI.

Our API for the Mongo layer was written using Spring Boot and had previously been designed to work with Hibernate and MSSQL. A lot of business logic had been written, and in many cases it was handled with Maps and Strings without explicit Java mappings. Yes, certain objects were mapped using a JSON parser in some places, but it was all done manually. To proceed, we needed to remove Hibernate, generate 2 additional sets of domain objects (VO->Mongo->Business API), write converters (e.g. Orika) and enhance the business logic to avoid parsing into HashMaps and instead use the MongoDB drivers to map directly to Java objects. We also gained the ability to use projections, aggregations and MongoDB views.

There was a portion of data in MSSQL that was extensively designed around relations (and this was on top of the cached API data) that we needed to convert to NoSQL and integrate into new collections. Removing relations and designing collections proved to be the tricky part, as we did not want to change our Angular application extensively. The business requirement was that, other than the paging changes, no visible functionality could change, and performance had to improve significantly.

MongoDB 4 came with support for transactions across collections, and even though the ideal usage of NoSQL is to contain everything within one collection, we could not afford to do this due to several factors. The first was that it would change the GUI extensively, the second was that we still could not lose the concept of relations that had been introduced (to some extent), the third was the size of the payload if we kept everything under the same JSON, and the last was performance issues on the Angular side due to parsing speed.

The perspective of SQL is speed between tables, correctness of data and data safety, while the perspective of NoSQL is ease of use and practicality. Setting up a NoSQL database engine compared to a SQL one is also a secondary benefit of NoSQL, as it is much easier to tune (at least from the aspect of what we were doing). Lastly, scaling is much easier to accomplish with NoSQL.
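To make the move from Maps and Strings to explicit domain objects more concrete, here is a minimal Java sketch of the three object sets (VO, Mongo document, business API) with hand-written converters. All of the type and field names are hypothetical, and the converters are plain methods rather than Orika mappings; it only illustrates the shape of the idea, not the actual project code.

```java
import java.util.List;
import java.util.Map;

// Hypothetical payload as it arrives from the business API layer.
record BusinessTask(String taskId, String taskTitle, List<String> lineItems) {}

// Hypothetical shape of the document stored in the MongoDB cache.
record TaskDocument(String id, String title, List<String> items) {}

// Hypothetical view object sent to the GUI: only what the screen needs.
record TaskVo(String id, String title, int itemCount) {}

class TaskConverters {
    // Business API -> Mongo document.
    static TaskDocument toDocument(BusinessTask source) {
        return new TaskDocument(source.taskId(), source.taskTitle(),
                List.copyOf(source.lineItems()));
    }

    // Mongo document -> view object for the client.
    static TaskVo toVo(TaskDocument doc) {
        return new TaskVo(doc.id(), doc.title(), doc.items().size());
    }

    // The older style for comparison: untyped access with a cast,
    // where a misspelled key or wrong type only fails at runtime.
    static String titleFromMap(Map<String, Object> raw) {
        return (String) raw.get("taskTitle");
    }

    public static void main(String[] args) {
        BusinessTask incoming = new BusinessTask("t1", "Review order", List.of("a", "b"));
        TaskVo vo = toVo(toDocument(incoming));
        System.out.println(vo);
    }
}
```

The typed version costs more code up front, but the compiler then checks every field access that the Map-based version leaves to runtime.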

Creating collections, from SQL to NoSQL

One of the challenging aspects of moving from SQL to NoSQL is designing appropriate data storage that takes into account everything already implemented and respects the best practices of the underlying technology. In SQL we have relations and normalization, and NoSQL is quite the opposite: ideally we want to contain as many aspects of a request as possible in a single collection. The thing is that we also need to consider how much code already relies on the confirmed contracts and APIs. If we have an ESB layer or a gateway, we may use it to bridge some of the gaps, but for some APIs, small corrections may be needed on both the server and the client side to fully gain better performance. In our case, the client was missing pagination and a consistent contract definition, and the sorting and filtering capabilities were inconsistent.

One of the first things we did was to work with the team on understanding the performance and page navigation benefits by looking into the information being queried and paginated. Since pagination can be done at the driver level, this was our initial goal. There is a secondary option of slicing arrays within one document, but this was not the preferred approach.

The next problem was dealing with a huge payload and frequent updates to the API. Older browsers had difficulties parsing this content continuously. We had a huge and diverse user base with different technical capabilities. Payload delivery needed to be carefully calculated to provide business value with the already crafted GUI components while keeping performance in mind. We absolutely could not count on customers always working with up-to-date computers and browsers, or on all of them having high-speed network access.
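The order of operations behind paging, sorting and filtering can be sketched in plain Java as below. This is an in-memory stand-in using streams, not the MongoDB driver itself, and the class and method names are hypothetical; with the MongoDB Java driver the same steps would typically be expressed by passing a filter to find and chaining sort, skip and limit on the result, so that the database does the work instead of the application.

```java
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;

class PsfSketch {
    // Number of records to skip for a zero-based page index.
    static int skipFor(int page, int size) {
        return page * size;
    }

    // Filter first, then sort, then cut out the requested page.
    static <T> List<T> query(List<T> all,
                             Predicate<? super T> filter,
                             Comparator<? super T> order,
                             int page,
                             int size) {
        return all.stream()
                .filter(filter)
                .sorted(order)
                .skip(skipFor(page, size))
                .limit(size)
                .toList();
    }

    public static void main(String[] args) {
        List<Integer> data = List.of(9, 4, 7, 1, 8, 2, 6);
        // Even numbers, ascending, second page (index 1) with 2 items per page.
        List<Integer> page = query(data, n -> n % 2 == 0,
                Comparator.<Integer>naturalOrder(), 1, 2);
        System.out.println(page);
    }
}
```

The important detail is the order: applying skip and limit before the filter or the sort would return a page of the wrong data set.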

As I mentioned earlier, MongoDB 4 included transactions across collections, and this helped significantly with our restructuring. We, of course, tried not to misuse this, as the NoSQL philosophy is not to build relational collections. Reference data, cached data and configuration data all found their way into new collections, and as they were accessed separately, this was not an issue. The main data was separated into three main collections, keeping in mind the business flows and the GUI design. Looking back at the end goal, I believe that it all worked out well.

The last thing to do was to create indexes based on frequently accessed data elements, add a few aggregations for different views and create several views to serve data for future use.

Changes to API contracts

Changes to the API contracts were made to standardize API development and the exposure of data to the client, to introduce paging, sorting and filtering in a consistent way, and to reorganize some of the APIs to better serve NoSQL data (all based on our changes to the collections). These changes were organized with client usage in mind. The question that we continuously asked ourselves was: how will the client get an appropriate amount of data, with good flexibility and the least amount of interaction with the APIs? The quality of the network in use varies across the world, and the network is also quite large. This all played into the performance enhancements that we were bringing to the solution. One of the main things was to restructure the models so that server-side models were separated from client-side models. We also introduced an abstraction layer in the business service layer to help with future changes.
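One way to picture a standardized paged contract is a single response envelope that every paged endpoint returns. The sketch below is an assumption about what such an envelope could look like; the PageResponse name and its fields are hypothetical and not the contract used in the project.

```java
import java.util.List;

// Hypothetical envelope shared by all paged endpoints.
record PageResponse<T>(List<T> content, int page, int size, long totalElements) {

    // Number of pages needed to hold totalElements, rounding up.
    long totalPages() {
        return size == 0 ? 0 : (totalElements + size - 1) / size;
    }

    // True when the client can request another page after this one.
    boolean hasNext() {
        return page + 1 < totalPages();
    }
}

class ContractSketch {
    public static void main(String[] args) {
        // First page (index 0) of 5 records, 2 records per page.
        PageResponse<String> first = new PageResponse<>(List.of("a", "b"), 0, 2, 5);
        System.out.println(first.totalPages() + " pages, next available: " + first.hasNext());
    }
}
```

Because the envelope is the same everywhere, the client needs only one piece of paging logic regardless of which API it calls.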

Updating Angular side

Apart from the various performance changes, the modifications for the API changes were relatively straightforward (and time consuming). The API was split into generic configuration and a task engine, and paged data was added where we used native MongoDB paging functionality. We also added sorting and filtering, as opposed to the SQL string store. MongoDB was able to process this without any performance hit. Instead of one API call to the back-end service we chained two or more services, usually one un-paged service for header data and paged services for business data. This worked well, gave us much smaller objects to deal with and improved performance by a long shot. Ultimately, the changes required in Angular were as listed below (this includes both the performance updates and the MongoDB updates for the API):

  • Introduction of modular children routes
  • Introduction of lazy loading modules
  • Introduction of PSF functionality based on native database support to execute such queries
  • Reduction of the payload size by remodeling data as a cache and introducing concept that we should process and load only data that user can see + 1 page of buffered data
  • Moving cross-field validation logic to the API, since validation logic on the client side should only apply to fields that are visible to the user

The end result of all of these operations was vastly improved performance, simplified development and maintenance, and a solid platform for future operations.